Abstract

In this paper, we consider the problem of link scheduling in multi-hop wireless networks under 
general interference constraints. Our goal is to design scheduling schemes that do not use per-flow or 
per-destination information, maintain a single data queue for each link, and exploit only local information, 
while guaranteeing throughput optimality. Although the celebrated back-pressure algorithm maximizes 
throughput, it requires per-flow or per-destination information. It is usually difficult to obtain and maintain 
this type of information, especially in large networks, where there are numerous flows. Also, the back- 
pressure algorithm maintains a complex data structure at each node, keeps exchanging queue length 
information among neighboring nodes, and commonly results in poor delay performance. In this paper, 
we propose scheduling schemes that can circumvent these drawbacks and guarantee throughput optimality.
These schemes use either the readily available hop-count information or only the local information for 
each link. We rigorously analyze the performance of the proposed schemes using fluid limit techniques 
via an inductive argument and show that they are throughput-optimal. We also conduct simulations to 
validate our theoretical results in various settings, and show that the proposed schemes can substantially
improve the delay performance in most scenarios. 

I. Introduction 

Link scheduling is a critical resource allocation functionality in multi-hop wireless networks, and also 
perhaps the most challenging. The seminal work of [1] introduces a joint adaptive routing and scheduling
algorithm, called back-pressure, that has been shown to be throughput-optimal, i.e., it can stabilize the
network under any feasible load. This paper focuses on the settings with fixed routes, where the back- 
pressure algorithm becomes a scheduling algorithm consisting of two components: flow scheduling and 
link scheduling. The back-pressure algorithm calculates the weight of a link as the product of the link 
capacity and the maximum "back-pressure" (i.e., the queue length difference between the queues at the 
transmitting nodes of this link and the next hop link for each flow) among all the flows passing through 
the link, and solves a MaxWeight problem to activate a set of non-interfering links that have the largest 
weight sum. The flow with the maximum queue length difference at a link is chosen to transmit packets 
when the link is activated. 
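
As a rough illustration of the weight computation described above, the following sketch computes the back-pressure weight of a single link and the flow it would serve. The function name, the dictionary-based queue representation, and the choice to treat non-positive queue differences as zero weight are our own assumptions for illustration, not details fixed by the text.

```python
def backpressure_weight(capacity, queue_here, queue_next):
    """Return (weight, chosen_flow) for one link.

    queue_here[f]: backlog of flow f at the transmitting node of this link.
    queue_next[f]: backlog of flow f at the transmitting node of its
                   next-hop link (missing entries are treated as 0).
    Assumption: a non-positive queue difference contributes zero weight.
    """
    best_flow, best_diff = None, 0
    for flow, backlog in queue_here.items():
        diff = backlog - queue_next.get(flow, 0)  # the "back-pressure"
        if diff > best_diff:
            best_flow, best_diff = flow, diff
    # Link weight = capacity times the maximum back-pressure over flows.
    return capacity * best_diff, best_flow
```

The MaxWeight step would then pick a set of non-interfering links maximizing the sum of these weights; note that every link needs the per-flow backlogs of its downstream neighbors, which is exactly the information exchange discussed later.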

The back-pressure algorithm, although throughput-optimal, needs to solve a MaxWeight problem,
which requires centralized operations and is NP-hard in general [2]. To address this, simple scheduling
algorithms based on Carrier Sense Multiple Access (CSMA) [3]-[5] are developed to achieve the
optimal throughput in a distributed manner for single-hop traffic, and are later extended to the case of
multi-hop traffic Q leveraging the basic idea of back-pressure. 

However, back-pressure-type scheduling algorithms (including CSMA for multi-hop traffic) have
the following shortcomings: they 1) require per-flow or per-destination information, which is usually difficult
to obtain and maintain, especially in large networks where there are numerous flows, 2) need to maintain 
separate queues for each flow or destination at each node, 3) rely on extensive exchange of queue length 
information among neighboring nodes to calculate link weights, which becomes the major obstacle to 
their distributed implementation, and 4) may result in poor overall delay performance, as the queue 
length needs to build up (creating the back-pressure) from a flow destination to its source, which leads to 
large queues along the route a flow takes O, Q. An important question is whether one can circumvent 
the above drawbacks of the back-pressure-type of algorithms and design throughput-optimal scheduling 
algorithms that do not require per-flow or per-destination information, maintain a small number of data 
queues (ideally, a single data queue for each link), exploit only local information when making scheduling 
decisions, and potentially have good delay performance. 

There have been some recent studies (e.g., [6], [8]-[10]) in this direction. A cluster-based back-pressure
algorithm that can reduce the number of queues is proposed in [9], where nodes (or routers) are grouped
into clusters and each node needs only to maintain separate queues for destinations within its cluster. In
[6], the authors propose a back-pressure policy making scheduling decisions in a shadow layer (where
counters are used as per-flow shadow queues). Their scheme only needs to maintain a single First-In First-Out
(FIFO) queue instead of per-flow queues for each link and shows dramatic improvement in the delay
performance. However, their shadow algorithm still requires per-flow information and constant exchange
of shadow queue length information among neighboring nodes. The work in [8] proposes to exploit
the local queue length information to design throughput-optimal scheduling algorithms. Their approach
combined with CSMA algorithms can achieve fully distributed scheduling without any information
exchange. Their scheme is based on a two-stage queue structure, where each node maintains two types
of data queues: per-flow queues and per-link queues. The two-stage queue structure imposes additional
complexity, and is similar to queues with regulators [11], which have been empirically noted to have
very large delays. In [10], the authors propose a back-pressure algorithm that integrates the shortest path
routing to minimize the average number of hops between each source and destination pair. However,
their scheme further increases the number of queues by maintaining a separate queue $\{i, d, k\}$ at each
node $i$ for the packets that will be delivered to destination node $d$ within $k$ hops.

Although these algorithms partly alleviate the effect of the aforementioned disadvantages of the
traditional back-pressure algorithms, to the best of our knowledge, no work has addressed all four of the
aforementioned issues. In particular, a critical drawback of the earlier mentioned works is that they require
per-flow or per-destination information to guarantee throughput optimality. In this paper, we propose a class
of throughput-optimal schemes that can remove this per-flow or per-destination information requirement, 
maintain a single data queue for each link, and remove information exchange. As a by-product, these 
proposed schemes also improve the delay performance in a variety of scenarios. 

The main contributions of our paper are as follows. 

First, we propose a scheduling scheme with per-hop queues to address the four key issues mentioned
earlier. The proposed scheme maintains multiple FIFO queues $Q_{l,k}$ at the transmitting node of each link
$l$. Specifically, any packet whose transmission over link $l$ is the $k$-th hop forwarding from its source node
is stored at queue $Q_{l,k}$. This hop-count information is much easier to obtain and maintain compared to
per-flow or per-destination information. For example, hop-count information can be obtained using
Time-To-Live (TTL) information in packet headers. Moreover, as mentioned earlier, while the number of flows
in a large network is very large, the number of hops is often much smaller. In the Internet, the longest
route a flow takes typically has tens of hops¹, while there are billions of users or nodes [14] and thus the
number of flows could be extremely large. A shadow algorithm similar to [6] is adopted in our framework,
where a shadow queue is associated with each data queue. We consider the MaxWeight algorithm based
on shadow queue lengths, and show that this per-Hop-Queue-based MaxWeight Scheduler (HQ-MWS) is
throughput-optimal using fluid limit techniques via a hop-by-hop inductive argument. For illustration, in
this paper, we focus on the centralized MaxWeight-type of policies. However, one can readily extend our
approach to a large class of scheduling policies (where fluid limit techniques can be used). For example,
combining our approach with the CSMA-based algorithms of [3]-[5], one can completely remove the
requirement of queue length information exchange, and develop fully distributed scheduling schemes,
under which no information exchange is required. To the best of our knowledge, this is the first work
that develops throughput-optimal scheduling schemes without per-flow or per-destination information in
wireless networks with multi-hop traffic. In addition, we believe that using this type of per-hop queue
structure to study the problem of link scheduling is of independent interest.

¹ In the Routing Information Protocol (RIP) [12], the longest route is limited to 15 hops. In general, an upper bound on the
length of a route is 255 hops in the Internet, as specified by TTL in the Internet Protocol (IP) [13].
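
To make the per-hop queue structure concrete, the sketch below places an arriving packet into a per-hop FIFO queue using only its TTL field. It is a minimal illustration under assumptions that are ours and not the paper's: every source is assumed to use the same known initial TTL (the value 64 is hypothetical), each forwarding is assumed to decrement TTL by one, and the class name and packet representation are invented for the example.

```python
from collections import defaultdict, deque

INITIAL_TTL = 64  # assumed common initial TTL; not specified in the paper


class PerHopQueues:
    """Per-hop FIFO queues kept at the transmitting node of one link."""

    def __init__(self):
        self.queues = defaultdict(deque)  # hop index k -> FIFO queue

    def enqueue(self, packet):
        # A packet forwarded (k - 1) times so far has TTL = INITIAL_TTL - (k - 1),
        # so its next transmission is its k-th hop.
        k = INITIAL_TTL - packet["ttl"] + 1
        self.queues[k].append(packet)
        return k
```

Under these assumptions, the number of queues per link is bounded by the longest route length rather than by the number of flows.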

Second, we have also developed schemes with per-link queues (i.e., a single data queue per link) 
instead of per-hop queues, extending the idea to per-Link-Queue-based MaxWeight Scheduler (LQ-MWS).
We propose two schemes based on LQ-MWS using different queueing disciplines. We first combine it 
with the priority queueing discipline (called PLQ-MWS), where a higher priority is given to the packet 
that traverses a smaller number of hops, and then prove throughput optimality of PLQ-MWS. It is of 
independent interest that this type of hop-count-based priority discipline enforces stability. This, however, 
requires that nodes sort packets according to their hop-count information. We then remove this restriction 
by combining LQ-MWS with the FIFO queueing discipline (called FLQ-MWS), and prove throughput 
optimality of FLQ-MWS in networks where flows do not form loops. We further propose fully distributed 
heuristic algorithms by combining our approach with the CSMA algorithms, and show that the fully 
distributed CSMA-based algorithms are throughput-optimal under the time-scale separation assumption. 

Finally, we show through simulations that the proposed schemes can significantly improve the delay 
performance in most scenarios. In addition, the schemes with per-link queues (PLQ-MWS and FLQ- 
MWS) perform well in a wider variety of scenarios, which implies that maintaining per-link queues not 
only simplifies the data structure, but also can contribute to scheduling efficiency and delay performance. 

The remainder of the paper is organized as follows. In Section II, we present a detailed description of
our system model. In Section III, we prove throughput optimality of HQ-MWS using fluid limit techniques
via a hop-by-hop inductive argument. We extend our ideas to show throughput optimality of PLQ-MWS
and FLQ-MWS in Section IV. Further, we show that our approach combined with the CSMA-based
algorithms leads to fully distributed scheduling schemes in Section V. We evaluate different scheduling
schemes through simulations in Section VI. Finally, we conclude our paper in Section VII.

II. System Model

We consider a multi-hop wireless network described by a directed graph $\mathcal{G} = (\mathcal{V}, \mathcal{E})$, where $\mathcal{V}$ denotes
the set of nodes and $\mathcal{E}$ denotes the set of links. Nodes are wireless transmitters/receivers and links are
wireless channels between two nodes if they can directly communicate with each other. Let $b(l)$ and $e(l)$
denote the transmitting node and receiving node of link $l = (b(l), e(l)) \in \mathcal{E}$, respectively. Note that we
distinguish between the two links in opposite directions between the same pair of nodes. We assume a
time-slotted system with a single frequency channel. Let
$c_l$ denote the link capacity of link $l$, i.e., link $l$ can transmit at most $c_l$ packets during a time slot if none
of the links that interfere with $l$ is transmitting at the same time. We assume unit capacity links, i.e.,
$c_l = 1$ for all $l \in \mathcal{E}$. A flow is a stream of packets from a source node to a destination node. Packets are
injected at the source, and traverse multiple links to the destination via multi-hop communications. Let
$\mathcal{S}$ denote the set of flows in the network. We assume that each flow $s$ has a single, fixed, and loop-free
route that is denoted by $\mathcal{C}(s) = (l_1^s, \ldots, l_{|\mathcal{C}(s)|}^s)$, where the route of flow $s$ has hop-length $|\mathcal{C}(s)|$ from
the source to the destination, $l_k^s$ denotes the $k$-th hop link on the route of flow $s$, and $|\cdot|$ denotes the
cardinality of a set. Let $L^{\max} = \max_s |\mathcal{C}(s)| < \infty$ denote the length of the longest route over all flows.
Let $H_{l,k}^s \in \{0, 1\}$ be 1, if link $l$ is the $k$-th hop link on the route of flow $s$, and 0, otherwise. Note
that the assumption of single route and unit capacity is only for ease of exposition, and one can readily
extend the results to more general scenarios with multiple fixed routes and heterogeneous link capacities,
applying the techniques used in this paper. We also restrict our attention to those links that have flows
passing through them. Hence, without loss of generality, we assume that $\sum_{s} \sum_{k=1}^{L^{\max}} H_{l,k}^s \ge 1$ for all
$l \in \mathcal{E}$.
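
As a small illustration of this notation, the sketch below builds the indicator $H_{l,k}^s$ and the longest route length $L^{\max}$ from a table of fixed routes. The function name and the dictionary representation (only the entries equal to 1 are stored) are our own choices for the example.

```python
def hop_indicator(routes):
    """routes: flow -> list of links in hop order.

    Returns (H, L_max), where H[(l, k, s)] == 1 exactly when link l is the
    k-th hop of flow s (absent keys mean 0), and L_max is the longest route.
    """
    H = {}
    for s, route in routes.items():
        for k, link in enumerate(route, start=1):
            H[(link, k, s)] = 1
    L_max = max(len(route) for route in routes.values())
    return H, L_max
```

For example, with routes `{"s1": ["a", "b"], "s2": ["b"]}`, link `b` is the second hop of `s1` and the first hop of `s2`, so it would hold two per-hop queues.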

The interference set of link $l$ is defined as $\mathcal{I}(l) = \{j \in \mathcal{E} \mid \text{link } j \text{ interferes with link } l\}$. We consider
a general interference model, where the interference is symmetric, i.e., for any $l, j \in \mathcal{E}$, if $l \in \mathcal{I}(j)$,
then $j \in \mathcal{I}(l)$. A schedule is a set of (active or inactive) links, and can be represented by a vector
$M \in \{0, 1\}^{|\mathcal{E}|}$, where component $M_l$ is set to 1 if link $l$ is active, and 0 if it is inactive. A schedule $M$ is
said to be feasible if no two active links of $M$ interfere with each other, i.e., $l \notin \mathcal{I}(j)$ for all $l, j$ with $M_l = 1$
and $M_j = 1$. Let $\mathcal{M}$ denote the set of all feasible schedules over $\mathcal{E}$, and let $Co(\mathcal{M})$ denote its convex
hull.
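
The feasibility condition can be checked directly from the interference sets. The following sketch enumerates all feasible schedules of a small network by brute force; the function names and the set-based representation of a schedule (the set of active links) are illustrative assumptions, and the enumeration is exponential in the number of links, so it is only meant for toy examples.

```python
from itertools import combinations


def is_feasible(active, interference):
    """True if no two active links interfere (interference is symmetric)."""
    return all(j not in interference[l] for l, j in combinations(active, 2))


def feasible_schedules(links, interference):
    """Enumerate every feasible schedule as a frozenset of active links."""
    schedules = []
    for r in range(len(links) + 1):
        for subset in combinations(links, r):
            if is_feasible(subset, interference):
                schedules.append(frozenset(subset))
    return schedules
```

For three links `a`, `b`, `c` where `b` interferes with both `a` and `c`, the feasible schedules are the empty schedule, the three single links, and `{a, c}`.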

Let $F_s(t)$ denote the cumulative number of packet arrivals at the source node of flow $s$ up to time slot
$t$. We assume that packets are of unit length. We assume that each arrival process $F_s(t) - F_s(t-1)$ is
an irreducible positive recurrent Markov chain with countable state space, and satisfies the Strong Law
of Large Numbers (SLLN): That is, with probability one,

$$\lim_{t \to \infty} \frac{F_s(t)}{t} = \lambda_s, \tag{1}$$

for each flow $s \in \mathcal{S}$, where $\lambda_s$ is the mean arrival rate of flow $s$. We let $\lambda = [\lambda_s]$ denote the arrival
rate vector. Also, we assume that the arrival processes are mutually independent across flows. (This
assumption can be relaxed as in [15].)

As in [15], a stochastic queueing network is said to be stable, if it can be described as a discrete-time
countable Markov chain and the Markov chain is stable in the following sense: The set of positive
recurrent states is non-empty, and it contains a finite subset such that with probability one, this subset is
reached within finite time from any initial state. When all the states communicate, stability is equivalent
to the Markov chain being positive recurrent [16]. We define the throughput region of a scheduling policy
as the set of arrival rate vectors for which the network is stable under this policy. Further, we define 
the optimal throughput region (or stability region) as the union of the throughput regions of all possible 
scheduling policies, including the offline policies [1]. We denote by $\Lambda^*$ the optimal throughput region,
whereby $\Lambda^*$ can be represented as

$$\Lambda^* = \Big\{ \lambda \;\Big|\; \text{for some } \phi \in Co(\mathcal{M}),\ \sum_s \sum_k H_{l,k}^s \lambda_s \le \phi_l \text{ for all links } l \in \mathcal{E} \Big\}. \tag{2}$$

An arrival rate vector is strictly inside $\Lambda^*$, if the inequalities above are all strict.

Throughout the paper, we let $(z)^+ = \max(z, 0)$ denote the larger value between $z$ and 0.

III. Scheduling with Per-hop Queues 

In this section, we propose scheduling policies with per-hop queues and a shadow algorithm. We will later
extend our ideas to developing schemes with per-link queues in Section IV. We describe our scheduling
schemes using the centralized MaxWeight algorithm for ease of presentation. Our approach combined with
the CSMA algorithms can be extended to develop fully distributed scheduling algorithms in Section V.

A. Queue Structure and Scheduling Algorithm 

We start with the description of queue structure, and then specify our scheduling scheme based on
per-hop queues and a shadow algorithm. We assume that, at the transmitting node of each link $l$, a single
FIFO data queue $Q_{l,k}$ is maintained for packets whose $k$-th hop is link $l$, where $1 \le k \le L^{\max}$. Such
queues are called per-hop queues. For notational convenience, we also use $Q_{l,k}(t)$ to denote the queue
length of $Q_{l,k}$ at time slot $t$. Let $\Pi_{l,k}(t)$ denote the service of $Q_{l,k}$ at time slot $t$, which takes a value
of $c_l$ (i.e., 1 in our setting), if queue $Q_{l,k}$ is active, or 0, otherwise. Let $D_{l,k}(t)$ denote the cumulative
number of packet departures from queue $Q_{l,k}$ up to time slot $t$, and let $\Psi_{l,k}(t) = D_{l,k}(t) - D_{l,k}(t-1)$
be the number of packet departures from queue $Q_{l,k}$ at time slot $t$. Since a queue may be empty when
it is scheduled, we have $\Psi_{l,k}(t) \le \Pi_{l,k}(t)$ for all time slots $t > 0$. Let $U_{s,k}(t)$ denote the cumulative
number of packets transmitted from the $(k-1)$-st hop to the $k$-th hop for flow $s$ up to time slot $t$ for
$1 \le k \le |\mathcal{C}(s)|$, where we set $U_{s,1}(t) = F_s(t)$. Let $A_{l,k}(t)$ be the cumulative number of aggregate
packet arrivals (including both exogenous arrivals and arrivals from the previous hops) at queue $Q_{l,k}$ up
to time slot $t$. Then, we have $A_{l,k}(t) = \sum_s H_{l,k}^s U_{s,k}(t)$, and in particular, $A_{l,1}(t) = \sum_s H_{l,1}^s F_s(t)$. Let
$P_{l,k}(t) = A_{l,k}(t) - A_{l,k}(t-1)$ denote the number of arrivals for queue $Q_{l,k}$ at time slot $t$. We adopt the
convention that $A_{l,k}(0) = 0$ and $D_{l,k}(0) = 0$ for all $l \in \mathcal{E}$ and $1 \le k \le L^{\max}$. The queue length evolves
as

$$Q_{l,k}(t) = Q_{l,k}(0) + A_{l,k}(t) - D_{l,k}(t). \tag{3}$$

For each data queue $Q_{l,k}$, we maintain a shadow queue $\hat{Q}_{l,k}$, and let $\hat{Q}_{l,k}(t)$ denote its queue length
at time slot $t$. The arrival and departure processes of the shadow queues are controlled as follows.
We denote by $\hat{A}_{l,k}(t)$ and $\hat{D}_{l,k}(t)$ its cumulative amount of arrivals and departures up to time slot $t$,
respectively. Also, let $\hat{\Pi}_{l,k}(t)$, $\hat{P}_{l,k}(t) = \hat{A}_{l,k}(t) - \hat{A}_{l,k}(t-1)$ and $\hat{\Psi}_{l,k}(t) = \hat{D}_{l,k}(t) - \hat{D}_{l,k}(t-1)$ denote
the amount of service, arrivals and departures of queue $\hat{Q}_{l,k}$ at time slot $t$, respectively. Likewise, we
have $\hat{\Psi}_{l,k}(t) \le \hat{\Pi}_{l,k}(t)$ for $t > 0$. We set by convention that $\hat{A}_{l,k}(0) = 0$ and $\hat{D}_{l,k}(0) = 0$ for all queues
$\hat{Q}_{l,k}$. The arrivals for shadow queue $\hat{Q}_{l,k}$ are set to $(1 + \epsilon)$ times the average amount of packet arrivals
at data queue $Q_{l,k}$ up to time slot $t$, i.e.,

$$\hat{P}_{l,k}(t) = (1 + \epsilon) \frac{A_{l,k}(t)}{t}, \tag{4}$$

where $\epsilon > 0$ is a sufficiently small positive number such that $(1 + \epsilon)\lambda$ is also strictly inside $\Lambda^*$ given
that $\lambda$ is strictly inside $\Lambda^*$. Then, the shadow queue length evolves as

$$\hat{Q}_{l,k}(t) = \hat{Q}_{l,k}(0) + \hat{A}_{l,k}(t) - \hat{D}_{l,k}(t). \tag{5}$$
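
Since a shadow queue is only a counter, its update can be sketched in a few lines. The class below is a minimal illustration of the shadow arrival rule (the running average of data arrivals scaled by $1 + \epsilon$) and of the queue length evolution. The class and method names are hypothetical, and the order of operations within a slot (arrivals are added before service is applied, and a departure cannot exceed the current shadow length) is an assumption of this sketch rather than something the text specifies.

```python
class ShadowQueue:
    """Counter tracking the shadow queue of one data queue."""

    def __init__(self, eps):
        self.eps = eps
        self.length = 0.0       # shadow queue length
        self.data_arrivals = 0  # cumulative arrivals at the data queue

    def step(self, t, new_data_arrivals, service):
        """Advance to time slot t >= 1; return (shadow arrival, shadow departure)."""
        self.data_arrivals += new_data_arrivals
        # Shadow arrival: (1 + eps) times the average data arrivals up to slot t.
        shadow_arrival = (1 + self.eps) * self.data_arrivals / t
        self.length += shadow_arrival
        # Departure is bounded by the offered service and by the backlog.
        departure = min(service, self.length)
        self.length -= departure
        return shadow_arrival, departure
```

Note that the update uses only quantities observable at the link itself, which is what allows the scheduling decisions to rely on local information.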

Using these shadow queues, we determine the service of both data queues and shadow queues using
the following MaxWeight algorithm.

Per-Hop-Queue-based MaxWeight Scheduler (HQ-MWS): At each time slot $t$, the scheduler serves
data queues $Q_{l,k^*(l)}$ for $l \in M^*$, where

$$k^*(l) \in \arg\max_{k} \hat{Q}_{l,k}(t), \text{ for each link } l \in \mathcal{E}, \tag{6}$$

$$M^* \in \arg\max_{M \in \mathcal{M}} \sum_{l \in \mathcal{E}} \hat{Q}_{l,k^*(l)}(t) \cdot M_l. \tag{7}$$

In other words, we set the service of data queue $Q_{l,k}$ as $\Pi_{l,k}(t) = 1$ if $l \in M^*$ and $k = k^*(l)$, and $\Pi_{l,k}(t) = 0$
otherwise. We also set the service of shadow queues as $\hat{\Pi}_{l,k}(t) = \Pi_{l,k}(t)$ for all $l$ and $k$.

Remark: The algorithm needs to solve a MaxWeight problem based on the shadow queue lengths, and
ties can be broken arbitrarily if there is more than one queue having the largest shadow queue length at a
link or there is more than one schedule having the largest weight sum. Note that we have $\hat{\Pi}_{l,k}(t) = \Pi_{l,k}(t)$
under this scheduling scheme, for all links $l \in \mathcal{E}$ and $1 \le k \le L^{\max}$ and for all time slots $t > 0$. Once a
schedule $M^*$ is selected, data queues $Q_{l,k^*(l)}$ for links $l$ with $M_l^* = 1$ are activated to transmit packets
if they are non-empty, and shadow queues $\hat{Q}_{l,k^*(l)}$ "transmit" shadow packets as well. Note that shadow
queues are just counters. The arrival and departure process of a shadow queue is simply an operation of
addition and subtraction, respectively.
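
The two selection steps of HQ-MWS can be sketched as follows for a toy network. This is a brute-force illustration only: it enumerates all subsets of links, so it is exponential in the network size, and the function name, the nested-dictionary layout of the shadow queue lengths, and the particular tie-breaking (first maximizer found) are assumptions made for the example.

```python
from itertools import combinations


def hq_mws(shadow, interference):
    """shadow: {link: {hop k: shadow queue length}}, one non-empty dict per link.

    Returns {link: k*} for the links of the selected schedule M*.
    """
    # Per link, pick the hop index with the largest shadow queue.
    k_star = {l: max(q, key=q.get) for l, q in shadow.items()}
    weight = {l: shadow[l][k_star[l]] for l in shadow}

    # Pick the feasible schedule with the largest weight sum.
    links = list(shadow)
    best, best_weight = (), 0.0
    for r in range(1, len(links) + 1):
        for subset in combinations(links, r):
            if any(j in interference[l] for l, j in combinations(subset, 2)):
                continue  # two active links interfere: infeasible
            w = sum(weight[l] for l in subset)
            if w > best_weight:
                best, best_weight = subset, w
    return {l: k_star[l] for l in best}
```

For links `a`, `b`, `c` where `b` interferes with both others, shadow lengths `a: {1: 3, 2: 1}`, `b: {1: 5}`, `c: {1: 1, 2: 4}` give link weights 3, 5 and 4, so the schedule `{a, c}` (weight 7) is selected and serves the first-hop queue of `a` and the second-hop queue of `c`.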

B. Throughput Optimality 

We present the main result of this section as follows. 

Proposition 1: HQ-MWS is throughput-optimal, i.e., the network is stable under HQ-MWS for any
arrival rate vector $\lambda$ strictly inside $\Lambda^*$.

We prove the stability of the network in the sense that the underlying Markov chain (whose state
accounts for both data queues and shadow queues; see Appendix A for the detailed state description) is
stable under HQ-MWS, using fluid limit techniques [15], [17]. We provide the proof of Proposition 1 in
Appendix A, and discuss the outline of the proof as follows.

Note that the shadow queues serve only single-hop traffic, i.e., after packets in the shadow queues are
served, they leave the system without being transmitted to another shadow queue. We also emphasize
that the single-hop shadow traffic gets smoothed under the arrival process of (4), and in the fluid limits
(which will be formally established in Appendix A), after a finite time, the instantaneous shadow arrival
rate is strictly inside the optimal throughput region $\Lambda^*$ with small enough $\epsilon > 0$. Then, using the standard
Lyapunov approach, we can show the stability for the sub-system consisting of shadow queues.

Now, we consider the data queues in the fluid limits starting from the first hop data queue for each
link $l \in \mathcal{E}$. Since the arrival process of data queue $Q_{l,1}$ satisfies the SLLN, the instantaneous arrival of
shadow queue $\hat{Q}_{l,1}$ will be equal to $(1 + \epsilon) \sum_s H_{l,1}^s \lambda_s$. This implies that the service rate of shadow queue
$\hat{Q}_{l,1}$ is no smaller than $(1 + \epsilon) \sum_s H_{l,1}^s \lambda_s$ due to the stability of shadow queues. Then, the service rate
of data queue $Q_{l,1}$ is also no smaller than $(1 + \epsilon) \sum_s H_{l,1}^s \lambda_s$ because $\hat{\Pi}_{l,k}(t) = \Pi_{l,k}(t)$ under HQ-MWS.
Since the arrival rate of data queue $Q_{l,1}$ is $\sum_s H_{l,1}^s \lambda_s$, the service rate is strictly greater than the arrival
rate for $Q_{l,1}$, establishing its stability. Using this as an induction base, we can show the stability of data
queues via a hop-by-hop inductive argument. This immediately implies that the fluid limit model of the
joint system is stable under HQ-MWS.

Although our proposed scheme shares similarities with [6], [8], it has important differences. First, in [6],
per-flow information is still required by their shadow algorithm. The shadow packets are injected into the
network at the sources, and are then "transmitted" to the destinations via multi-hop communications. Their
scheme strongly relies on the information exchange of shadow queue lengths to calculate the link weights.
In contrast, we take a different approach for constructing the instantaneous arrivals at each shadow
queue according to (4) that is based on the average amount of packet arrivals at the corresponding data
queue. This method of injecting shadow packets allows us to decompose multi-hop traffic into single-hop
traffic for shadow queues and exploit only local information when making scheduling decisions. Second,
although the basic idea behind the shadow arrival process of (4) is similar to the service process of the
per-flow queues in [8], the scheme in [8] requires per-flow information and relies on a two-stage queue
architecture that consists of both per-flow and per-link data queues. In contrast, our scheme needs only 
per-hop (and not per-flow) information, i.e., the number of hops each packet has traversed, completely 
removing per-flow information and per-flow queues. This simplification of required information and data 
structure is critical, due to the fact that the maximum number of hops in a network is usually much 
smaller than the number of flows in a large network. For example, in the Internet, the longest route a 
flow takes typically has tens of hops, while there are billions of nodes and thus the number of flows 
could be extremely large. 

Note that the hop-count in our approach is counted from the source. Such per-hop information is easy 
to obtain (e.g., from Time-to-Live or TTL information in the Internet and ad hoc networks). At each 
link, packets with the same hop-count (from the source of each packet to the link) are kept at the same 
queue, regardless of sources, destinations, and flows, which significantly reduces the number of queues. 
In Section IV, we extend our approach to the schemes with per-link queues, and further remove even
the requirement of per-hop information. 

IV. Scheduling with Per-link Queues 

In the previous section, we showed that the per-hop-queue-based MaxWeight scheduler (HQ-MWS) achieves
optimal throughput performance. In this section, we extend our ideas to developing schemes with per-link
queues. To elaborate, we show that the per-link-queue-based MaxWeight scheduler, when associated
with priority or FIFO queueing discipline, can also achieve throughput optimality.

A. MaxWeight Algorithm with Per-link Queues

We consider a network where each link $l$ has a single data queue $Q_l$. Let $Q_l(t)$, $A_l(t)$, $D_l(t)$, $\Pi_l(t)$,
$\Psi_l(t)$ and $P_l(t)$ denote the queue length, cumulative arrival, cumulative departure, service, departure
and arrival at the data queue $Q_l$, respectively. Also, we maintain a shadow queue $\hat{Q}_l$ associated with
each $Q_l$, and let $\hat{Q}_l(t)$, $\hat{A}_l(t)$, $\hat{D}_l(t)$, $\hat{\Pi}_l(t)$, $\hat{\Psi}_l(t)$ and $\hat{P}_l(t)$ denote the queue length, cumulative arrival,
cumulative departure, service, departure and arrival at the shadow queue $\hat{Q}_l$, respectively. Similar to
(4) for per-hop shadow queues, we control the arrivals to the shadow queue $\hat{Q}_l$ as

$$\hat{P}_l(t) = (1 + \epsilon) \frac{A_l(t)}{t}, \tag{8}$$

where $\epsilon > 0$ is a sufficiently small positive number.

Next, we specify the MaxWeight algorithm with per-link queues as follows.

Per-Link-Queue-based MaxWeight Scheduler (LQ-MWS): At each time slot $t$, the scheduler serves
links in $M^*$ (i.e., $\Pi_l(t) = 1$ for $l \in M^*$, and $\Pi_l(t) = 0$ otherwise), where

$$M^* \in \arg\max_{M \in \mathcal{M}} \sum_{l \in \mathcal{E}} \hat{Q}_l(t) \cdot M_l.$$

Also, we set the service of shadow queues as $\hat{\Pi}_l(t) = \Pi_l(t)$ for all $l$.

As in HQ-MWS, the shadow traffic under LQ-MWS gets smoothed due to the shadow arrival
assignment of (8), and the instantaneous arrival rate of shadow queues can be shown to be strictly inside
the optimal throughput region $\Lambda^*$. Hence, we show in Lemma 20 (see Appendix D) that the fluid limit
model for the sub-system consisting of shadow queues is stable under LQ-MWS, using the standard
Lyapunov approach and following the same line of analysis for HQ-MWS.
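
One time slot of LQ-MWS can be sketched as below, given a precomputed list of feasible schedules. The function name and data layout are hypothetical; the sketch assumes unit link capacity, decrements the shadow counter by one unit per activation (floored at zero), and deliberately omits shadow arrivals and the forwarding of served packets to their next-hop queues.

```python
from collections import deque


def lq_mws_slot(data, shadow, schedules):
    """Serve one time slot.

    data:      {link: deque of packets}  (single data queue per link)
    shadow:    {link: shadow queue length}
    schedules: list of feasible schedules, each a tuple of links
    Returns (selected schedule, {link: packet sent}).
    """
    # MaxWeight over shadow queue lengths; ties go to the first maximizer.
    m_star = max(schedules, key=lambda m: sum(shadow[l] for l in m))
    sent = {}
    for l in m_star:
        if data[l]:  # an activated data queue may be empty
            sent[l] = data[l].popleft()
        # The shadow queue "transmits" as well: a simple counter decrement.
        shadow[l] = max(shadow[l] - 1.0, 0.0)
    return m_star, sent
```

The example shows that a link can be activated because of its shadow queue even when its data queue is momentarily empty, in which case only the shadow counter changes.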

B. LQ-MWS with Priority Discipline 

We develop a scheduling scheme by combining LQ-MWS with priority queueing discipline, called 
PLQ-MWS. Regarding priority of packets at each per-link queue, we define hop-class as follows: A
packet has hop-class-$k$, if the link where the packet is located is the $k$-th hop from the source of the
packet. When a link is activated to transmit packets, packets with a smaller hop-class will be transmitted 
earlier; and packets with the same hop-class will be transmitted in a FIFO fashion. 
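
The hop-class priority discipline can be realized with a single heap per link. The sketch below is one possible implementation, assuming the hop-class of each packet is supplied on arrival; the class name and the use of a running sequence number to enforce FIFO order within a hop-class are our own choices.

```python
import heapq
from itertools import count


class HopPriorityQueue:
    """Per-link queue: smaller hop-class first, FIFO within a hop-class."""

    def __init__(self):
        self._heap = []
        self._seq = count()  # arrival order, used to break ties FIFO

    def push(self, packet, hop_class):
        heapq.heappush(self._heap, (hop_class, next(self._seq), packet))

    def pop(self):
        _hop_class, _seq, packet = heapq.heappop(self._heap)
        return packet

    def __len__(self):
        return len(self._heap)
```

Because the sequence number is unique, two heap entries never compare on the packet itself, so packets can be arbitrary objects.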

Proposition 2: PLQ-MWS is throughput-optimal. 

We provide the outline of the proof and refer to Appendix E for the detailed proof. Basically, we follow
the line of analysis for HQ-MWS using fluid limit techniques and induction method. Since a link transmits
packets according to their priorities (i.e., hop-classes or hop-count from their respective source nodes), we
can view packets with hop-class-$k$ at link $l$ as in a sub-queue $Q_{l,k}$ (similar to the per-hop queues under
HQ-MWS). Now, we consider the data queues in the fluid limits. Since the exogenous arrival process
satisfies the SLLN, the instantaneous arrival to shadow queue $\hat{Q}_l$ will be at least $(1 + \epsilon) \sum_s H_{l,1}^s \lambda_s$ for
each link $l \in \mathcal{E}$. This implies that the service rate of shadow queue $\hat{Q}_l$ is no smaller than $(1 + \epsilon) \sum_s H_{l,1}^s \lambda_s$
due to the stability of the shadow queues (see Lemma 20 in Appendix D). Then, the service rate of
sub-queue $Q_{l,1}$ is also no smaller than $(1 + \epsilon) \sum_s H_{l,1}^s \lambda_s$, because: 1) $\hat{\Pi}_l(t) = \Pi_l(t)$ under PLQ-MWS; and
2) the highest priority is given to sub-queue $Q_{l,1}$ when link $l$ is activated to transmit. Since the arrival
rate of sub-queue $Q_{l,1}$ is $\sum_s H_{l,1}^s \lambda_s$, the service rate is strictly greater than the arrival rate for sub-queue
$Q_{l,1}$, establishing its stability. Similarly, we can show that the hop-class-$j$ sub-queues are stable for all
$j \le k + 1$, given the stability of the hop-class-$i$ sub-queues for all $i \le k$. Therefore, we can show the
stability of the data queues via a hop-by-hop inductive argument. This immediately implies that the fluid
limit model of the joint system is stable under PLQ-MWS.

We emphasize that a "bad" priority discipline may cause instability (even in wireline networks). See
[18], [19] for two simple counterexamples showing that in a two-station network, a static priority discipline
that gives a higher priority to customers with a larger hop-count, may result in instability. (Interested
readers are also referred to Chapter 3 of [16] for a good summary of the instability results.) The key
intuition of these counterexamples is that, by giving a higher priority to packets with a larger hop-count 
in one station, the priority discipline may impede forwarding packets with a smaller hop-count to the 
next-hop station, which in turn starves the next-hop station. On the other hand, PLQ-MWS successfully 
eliminates this type of inefficiency by giving a higher priority to the packets with a smaller hop-count, 
and continues to push the packets to the following hops. 

Note that PLQ-MWS is different from HQ-MWS, although they appear to be similar. HQ-MWS makes 
scheduling decisions based on the queue length of each per-hop shadow queue. This may result in a waste 
of service if a per-hop queue is activated but does not have enough packets to transmit, even though 
the other per-hop queues of the same link have packets. In contrast, PLQ-MWS makes decisions based 
on the queue length of each per-link shadow queue and allows a link to transmit packets of multiple 
hop-classes, avoiding such an inefficiency. The performance difference due to this phenomenon will be 
illustrated through simulations in Section VI. Furthermore, the implementation of PLQ-MWS is easier
than HQ-MWS, since PLQ-MWS needs to maintain only one single shadow queue per link. 

Another aspect of PLQ-MWS we would like to discuss is about the hop-count-based priority discipline 
in the context of multi-class queueing networks (or wireline networks). In operations research, stability 
of multi-class queueing networks has been extensively studied in the literature (e.g., see [16] and the
references therein). To the best of our knowledge, however, there is very limited work on the topic of
"priority enforces stability" [20]-[22]. In [20], [21], the authors obtained sufficient conditions (based
on linear or piecewise linear Lyapunov functions) for the stability of a multiclass fluid network and/or
queueing network under priority disciplines. However, checking these sufficient conditions relies on
verifying the feasibility of a set of inequalities, which in general can be very difficult. The most related
work to ours is [22]. There, the authors showed that under the condition of "Acyclic Class Transfer",
where customers can switch classes unless there is a loop in class transfers, a simple priority discipline 
stabilizes the network under the usual traffic condition (i.e., the normalized load is less than one). Their 
priority discipline gives a higher priority to customers that are closer to their respective sources. 

Interestingly, our hop-count-based priority discipline (for wireline networks) is similar to the discipline
proposed in [22]. However, there is a major difference: while [22] studies the stability of wireline
networks (without link interference) under the usual traffic condition, we consider the stability of wireless
networks with interference constraints that impose the (link) scheduling problem, which is much more
challenging. In wireless networks, the service rate of each link depends on the underlying scheduling
scheme, rather than being fixed as in wireline networks. Hence, the difficulty is to establish the usual traffic
condition by designing appropriate wireless scheduling schemes. In this paper, we develop the PLQ-MWS
scheme and show that the usual traffic condition, and then stability, can be established via a hop-by-hop
inductive argument under the PLQ-MWS scheme.

C. LQ-MWS with FIFO Discipline 

In this section, we develop a scheduling scheme, called FLQ-MWS, by combining the LQ-MWS
algorithm developed in Section IV-A with the FIFO queueing discipline (instead of the priority queueing
discipline), and show that this scheme is throughput-optimal if flows do not form loops. We emphasize that
FLQ-MWS requires neither per-flow information nor hop-count information.

To begin with, we define a positive integer $r(l)$ as the rank of link $l \in \mathcal{E}$, and call $R(\mathcal{E}) = \{r(l) \mid l \in \mathcal{E}\}$
a ranking of $\mathcal{E}$. Recall that $\mathcal{L}(s)$ denotes the loop-free route of flow $s$. In the following, we prove a key
property of networks where flows do not form loops, which will be used to prove the main results in
this section.

Lemma 3: Consider a network $\mathcal{G} = (\mathcal{V}, \mathcal{E})$ with a set of flows $\mathcal{S}$, where the flows do not form loops.
There exists a ranking $R(\mathcal{E})$ such that the following two statements hold:

1) For any flow $s \in \mathcal{S}$, the ranks are monotonically increasing when one traverses the links of flow
$s$ from $l_1^s$ to $l_{|\mathcal{L}(s)|}^s$, i.e., $r(l_i^s) < r(l_{i+1}^s)$ for all $1 \le i < |\mathcal{L}(s)|$.
2) The packet arrivals at a link are either exogenous, or forwarded from links with a smaller rank.

We provide the proof of Lemma 3 in the appendix. Note that such a ranking with the monotone property
exists because the flows do not form a loop. In contrast, it is clear that if flows form a loop, then such
a ranking does not exist. Two examples of networks where flows do not form loops are provided in
Figs. 5(b) and 5(c), and an example of a network where flows do form a loop is provided in Fig. 5(a).



Note that the ranking is only for the purpose of analysis and plays a key role in proving the system 
stability under FLQ-MWS, while it will not be used in the actual link scheduling algorithm. 
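Although the ranking is used only in the analysis, it may help to see how one could be computed. The Python sketch below is an illustration, not the construction used in the proof: it builds a precedence graph on links (an edge from each link of a route to the next link of the same route) and assigns ranks by a topological pass. The function name `link_ranking` and the example routes are hypothetical; the second example reuses the two looping flows of the grid network discussed in the numerical results.

```python
from collections import defaultdict, deque


def link_ranking(routes):
    """Return {link: rank} with ranks increasing along every route,
    or None if the routes form a loop (no such ranking exists)."""
    succ = defaultdict(set)
    indeg = defaultdict(int)
    links = set()
    for route in routes:
        links.update(route)
        for a, b in zip(route, route[1:]):
            if b not in succ[a]:
                succ[a].add(b)
                indeg[b] += 1
    # Links that only receive exogenous arrivals get the smallest rank.
    rank = {l: 1 for l in links if indeg[l] == 0}
    queue = deque(rank)
    done = 0
    while queue:
        a = queue.popleft()
        done += 1
        for b in succ[a]:
            rank[b] = max(rank.get(b, 1), rank[a] + 1)
            indeg[b] -= 1
            if indeg[b] == 0:
                queue.append(b)
    return rank if done == len(links) else None


loop_free = [["a", "b", "c"], ["d", "b"]]
ranks = link_ranking(loop_free)

looped = [[(5, 9), (9, 10), (10, 11), (11, 12), (12, 8)],
          [(12, 8), (8, 7), (7, 6), (6, 5), (5, 9)]]
no_ranks = link_ranking(looped)
```

For the loop-free routes, every link receives a rank strictly larger than the ranks of the links that feed it, which is the monotone property of Lemma 3; for the looping routes, the pass cannot finish and the function reports that no ranking exists.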

Now, we give the main results of this section in the following proposition. 

Proposition 4: FLQ-MWS is throughput-optimal in networks where flows do not form loops. 

We omit the detailed proof and refer to the appendix. In the following, we provide an outline of the
proof. Motivated by Lemma 3, we extend our analysis for HQ-MWS (or PLQ-MWS). Compared to the
PLQ-MWS algorithm, there are differences only in the operations with data queues, and the underlying
LQ-MWS algorithm remains the same. Thus, the shadow queues exhibit similar behavior, and the
fluid limit model for the sub-system of shadow queues is stable under FLQ-MWS (see Lemma 20 in
the appendix). Also, note that Lemma 3 implies that, given a qualified ranking (without loss of generality,
assuming that the smallest rank is 1), the packet arrivals at links with rank 1 are all exogenous. Then,
following an argument similar to that in the proof of Proposition 1, we can prove the stability of the corresponding
data queues by showing that the instantaneous arrival rate is less than the instantaneous service rate. Since
Lemma 3 also implies that the packet arrivals at links with rank 2 are either exogenous or from links
with rank 1, we can similarly show the stability of links with rank 2. Repeating the above argument, we
can prove the stability of all data queues by induction, which completes the proof of Proposition 4.

Corollary 5: FLQ-MWS is throughput-optimal in tree networks. 

The above corollary follows immediately from Proposition 4, because a tree network itself does not
contain a cycle of links and flows are all loop-free. 

V. Extension to CSMA-based Distributed Algorithms 

In this section, we employ CSMA techniques to develop fully distributed throughput-optimal scheduling
schemes for multi-hop traffic. We consider per-link-queue-based schemes combined with the CSMA-based
scheduling of [4].

A. Basic Scheduling Algorithm

We start with a description of the basic scheduling algorithm based on CSMA. As in [4], we divide each
time slot $t$ into a control slot and a data slot, where the control slot is further divided into $W$ mini-slots.
The purpose of the control slot is to generate a collision-free transmission schedule $M(t) \in \mathcal{M}$. To
this end, the distributed CSMA scheduling selects at each time slot a set of links that form a feasible
schedule. Such a schedule is called a decision schedule and is used to change the links' state (between active
and inactive). Let $\sigma(t)$ denote a decision schedule at time slot $t$.

Let $\mathcal{M}_0 \subseteq \mathcal{M}$ denote the set of possible decision schedules under our CSMA-based algorithm. A
decision schedule is selected through a randomized procedure, e.g., a decision schedule $\sigma(t) \in \mathcal{M}_0$ is
selected with a positive probability $\alpha(\sigma(t))$ satisfying $\sum_{\sigma(t) \in \mathcal{M}_0} \alpha(\sigma(t)) = 1$. Based on the decision
schedule, the schedule for actual data transmission is determined as follows. For each link $l \in \sigma(t)$, if no
link in its set of interfering neighbors $I(l)$ was active at time slot $t - 1$, then the state of link $l$ becomes active
with probability $p_l$ (which will be specified later) and inactive with probability $\bar{p}_l = 1 - p_l$ during time
slot $t$. If at least one link in $I(l)$ was active in the previous time slot, then link $l$ remains inactive in
the current data slot. Any link $l' \notin \sigma(t)$ will have its state unchanged from the previous time slot. Since
the current state $M(t)$ depends only on the previous state $M(t - 1)$ and the randomly selected decision
schedule $\sigma(t)$, the transmission schedule $M(t)$ evolves as a discrete-time Markov chain (DTMC). Our
basic scheduling algorithm is very similar to that of [4]. The key difference is that the link activation
probability is based on the shadow queue lengths instead of the data queue lengths. We refer the readers
to for the detailed operations of the CSMA-based algorithms. 



B. Distributed Implementation with Per-link Queues 

In this section, we describe our distributed CSMA-based scheduling scheme with per-link queues, 
called LQ-CSMA. The LQ-CSMA algorithm can be combined with priority or FIFO queueing discipline 
to develop fully distributed scheduling schemes. 

We use the system settings and notations of the per-link-queue structure as in Section IV. We also control
the shadow arrivals as ([D. As in [4], we set the link activation probability $p_l = \frac{e^{w_l(t)}}{e^{w_l(t)} + 1}$, where $w_l(t)$ is the
weight of link $l$. We begin by defining a class of functions that will be used for weight calculation. As
in [4], [23], let $\mathcal{B}$ denote the set of functions $g(\cdot) : [0, \infty] \to [0, \infty]$ that satisfy the following conditions:

1) $g(x)$ is a non-decreasing and continuous function with $\lim_{x \to \infty} g(x) = \infty$.
2) Given any $M_1 > 0$, $M_2 > 0$ and $0 < \epsilon < 1$, there exists a $B < \infty$ such that for all $x > B$, we
have $(1 - \epsilon) g(x) \le g(x - M_1) \le g(x + M_2) \le (1 + \epsilon) g(x)$.

For example, functions $g(x) = \log(x + 1)$, $g(x) = x^{\alpha}$ with $\alpha > 0$, and g(x) = belong to $\mathcal{B}$, while
g(x) = does not. Similar to Chapter 4 of [24], to guarantee the existence of the fluid limit, we further
define $\mathcal{C}$ as a subset of $\mathcal{B}$ such that $g(0) = 0$, and for any $(x_1, \ldots, x_n)$ and $(y_1, \ldots, y_n)$ in $[0, \infty]^n$ and
for any $\eta \in [0, 1]$,

$$\sum_i g(x_i) \ge \eta \sum_i g(y_i) \implies \sum_i g(r x_i) \ge \eta \sum_i g(r y_i), \quad \text{for all } r > 0. \tag{9}$$

^In the previous data slot, link $l$ must be inactive since the schedule must be feasible.

For example, $g(x) = x^{\alpha}$ with $\alpha > 0$ is in $\mathcal{C}$.
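Condition 2) on the class of weight functions can be spot-checked numerically. The Python sketch below evaluates the chain of inequalities at a single large argument; it is an illustration only, not a proof of membership, and the function name `satisfies_condition_2` as well as the test values of $x$, $M_1$, $M_2$ and $\epsilon$ are hypothetical. The exponential function is included as a standard example of a function that grows too fast to satisfy the condition.

```python
import math


def satisfies_condition_2(g, x, m1, m2, eps):
    """Check (1 - eps) g(x) <= g(x - m1) <= g(x + m2) <= (1 + eps) g(x) at one point x."""
    gx = g(x)
    return (1 - eps) * gx <= g(x - m1) <= g(x + m2) <= (1 + eps) * gx


# log(x + 1) changes very little when its argument is shifted by a constant.
log_ok = satisfies_condition_2(lambda x: math.log(x + 1), 1e6, m1=5, m2=5, eps=0.01)

# exp(x) changes by a constant factor e^{-m1}, no matter how large x is.
exp_ok = satisfies_condition_2(math.exp, 50.0, m1=5, m2=5, eps=0.01)

print(log_ok, exp_ok)
```

The intuition is that admissible weight functions must be insensitive, in relative terms, to bounded changes in the queue length once the queue is large.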

We set the weight of link $l \in \mathcal{E}$ at time slot $t$ as $w_l(t) = g_l(\hat{Q}_l(t))$, where $g_l \in \mathcal{C}$. We highlight the
differences from the original CSMA-based scheduling schemes as follows: i) the link weight is calculated
by a function in set $\mathcal{C}$ instead of $\mathcal{B}$, a restriction that is necessary to apply the fluid limit techniques; ii) the
shadow queue length $\hat{Q}_l(t)$ is used for the weight calculation instead of the data queue length $Q_l(t)$. The
following scheduling scheme extends the per-link-queue-based scheduling schemes to a CSMA-based
algorithm.
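As a small illustration of how a shadow queue length maps to an activation probability, the Python sketch below assumes the usual Q-CSMA form $p_l = e^{w_l}/(e^{w_l} + 1)$ and the power weight function $g(x) = x^{\alpha}$. The function name `activation_probability` and the default $\alpha = 0.5$ are hypothetical choices made for the example.

```python
import math


def activation_probability(shadow_len, alpha=0.5):
    """Return p = e^w / (e^w + 1) for the illustrative weight w = shadow_len ** alpha."""
    w = shadow_len ** alpha
    # 1 / (1 + e^{-w}) equals e^w / (e^w + 1) and avoids overflow for large w.
    return 1.0 / (1.0 + math.exp(-w))


for length in (0, 1, 4, 100):
    print(length, activation_probability(length))
```

An empty shadow queue gives probability one half under this weight function, and the probability increases toward one as the shadow queue grows, so heavily backlogged links are activated more aggressively.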

Per-Link-Queues-and-CSMA-based Scheduling Algorithm (LQ-CSMA): 

Let $p_l = \frac{e^{w_l(t)}}{e^{w_l(t)} + 1}$, where $w_l(t) = g_l(\hat{Q}_l(t))$ is an appropriate function of the shadow queue length
of link $l$ as shown above. At the beginning of each time slot, each link $l$ randomly selects a backoff
time among $\{0, 1, 2, \ldots, W - 1\}$, where $W$ denotes the contention window size. Link $l$ will send an
INTENT message to announce its decision to attempt the channel when this backoff time expires, unless
an interfering link in $I(l)$ sent an INTENT message in an earlier mini-slot. The details are shown in
Algorithm 1, which is similar to the Q-CSMA algorithm of [4], except that the activation probability $p_l$
is now determined based on the shadow queue lengths.

Remark: The weight function $g_l(\hat{Q}_l(t))$ needs to be appropriately chosen such that the DTMC of the
transmission schedules converges faster than the dynamics of the link weights. For example,
$g_l(\hat{Q}_l(t)) = \alpha \hat{Q}_l(t)$ with a small $\alpha$ is suggested as a heuristic to satisfy the time-scale separation
assumption in [3], and $g_l(\hat{Q}_l(t)) = \log\log(\hat{Q}_l(t) + e)$ is used in the proof of throughput optimality
in im to essentially separate the time scales. In addition, it has been reported in [4] that the weight
function $g_l(\hat{Q}_l(t)) = \log(\alpha \hat{Q}_l(t))$ with a small $\alpha$ gives the best empirical delay performance. In this
paper, we make the time-scale separation assumption as in [3], [4] and assume that the DTMC is in the
steady state at every time slot.

^In Ol-dl, the weight function $g_l$ is a function of the queue length $Q_l(t)$ rather than $\hat{Q}_l(t)$.

Algorithm 1 LQ-CSMA (at time slot $t$)

1) Link $l$ selects a random (integer) backoff time $B_l$ uniformly in $[0, W - 1]$ and waits for $B_l$ control
   mini-slots.
2) IF link $l$ hears an INTENT message from a link in $I(l)$ before the $(B_l + 1)$-st control mini-slot,
   $l$ will not be included in $\sigma(t)$ and will not transmit an INTENT message anymore. Link $l$ will set
   $M_l(t) = M_l(t - 1)$.
3) IF link $l$ does not hear an INTENT message from any link in $I(l)$ before the $(B_l + 1)$-st control
   mini-slot, it will send (broadcast) an INTENT message to all links in $I(l)$ at the beginning of the
   $(B_l + 1)$-st control mini-slot.
   - If there is a collision (i.e., if there is another link in $I(l)$ transmitting an INTENT message in
     the same mini-slot), link $l$ will not be included in $\sigma(t)$ and will set $M_l(t) = M_l(t - 1)$.
   - If there is no collision, link $l$ will be included in $\sigma(t)$ and decide its state as follows:
       if no links in $I(l)$ were active in the previous data slot then
           $M_l(t) = 1$ with probability $p_l$, $0 < p_l < 1$;
           $M_l(t) = 0$ with probability $\bar{p}_l = 1 - p_l$.
       else
           $M_l(t) = 0$.
       end if
4) IF $M_l(t) = 1$, link $l$ will transmit a packet in the data slot, and will set $Q_l(t) = (Q_l(t) - 1)^+$.
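The control-slot contention and state update of Algorithm 1 can be sketched in Python as follows. This is a simplified simulation under stated assumptions, not the authors' implementation: the function name `lq_csma_slot`, the dictionary-based data layout, and the example conflict graph are hypothetical, and the sketch assumes that any INTENT attempt, including a collided one, silences the neighbors that have not yet sent. Queue updates (step 4) are omitted.

```python
import random


def lq_csma_slot(neighbors, prev_state, prob, window, rng):
    """Simulate one time slot.

    neighbors[l]  : set of links interfering with l, i.e., I(l)
    prev_state[l] : M_l(t-1), either 0 or 1
    prob[l]       : activation probability p_l
    Returns (decision_schedule, new_state).
    """
    backoff = {l: rng.randrange(window) for l in neighbors}
    silenced, decision = set(), set()
    for slot in range(window):
        senders = [l for l in neighbors if backoff[l] == slot and l not in silenced]
        for l in senders:
            # No interfering link sends in the same mini-slot: no collision.
            if not any(n in senders for n in neighbors[l]):
                decision.add(l)
        for l in senders:
            silenced.update(neighbors[l])  # assumption: every attempt is heard
    state = dict(prev_state)  # links outside the decision schedule keep their state
    for l in decision:
        if any(prev_state[n] for n in neighbors[l]):
            state[l] = 0
        else:
            state[l] = 1 if rng.random() < prob[l] else 0
    return decision, state


# Hypothetical conflict graph: four links in a line, each interfering with its neighbors.
neighbors = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}
rng = random.Random(0)
state = {l: 0 for l in neighbors}
prob = {l: 0.7 for l in neighbors}
history = []
for _ in range(200):
    decision, state = lq_csma_slot(neighbors, state, prob, window=4, rng=rng)
    history.append((decision, dict(state)))
```

Starting from a feasible state, every decision schedule produced this way is collision-free and every resulting transmission schedule remains feasible, which is the property the Markov chain analysis relies on.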

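Applying Lemma 3 of [4], we can show that the transmission schedule $M(t)$ produced by LQ-CSMA
is feasible and the decision schedule $\sigma$ satisfies $\bigcup_{\sigma \in \mathcal{M}_0} \sigma = \mathcal{E}$ when $W \ge 2$. Applying Proposition 1
of [4], we can obtain that the DTMC of the transmission schedules is irreducible and aperiodic (and
reversible in this case), and has the following product-form stationary distribution:

$$\pi(M) = \frac{1}{Z} \prod_{l \in M} \frac{p_l}{\bar{p}_l}, \tag{10}$$

$$Z = \sum_{M \in \mathcal{M}} \prod_{l \in M} \frac{p_l}{\bar{p}_l}. \tag{11}$$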

Then from Proposition 2 of [4], we can obtain the following lemma.

Lemma 6: If the window size $W \ge 2$, LQ-CSMA has the product-form distribution given by (10).
Further, given any $\zeta$ and $\gamma$, $0 < \zeta, \gamma < 1$, there exists a $Q_B > 0$ such that: at any time slot $t$, with
probability greater than $1 - \zeta$, LQ-CSMA chooses a schedule $M(t) \in \mathcal{M}$ that satisfies

$$\sum_{l \in \mathcal{E}} w_l(t) M_l(t) \ge (1 - \gamma) \max_{M \in \mathcal{M}} \sum_{l \in \mathcal{E}} w_l(t) M_l \tag{12}$$

whenever $\|\hat{Q}(t)\| > Q_B$.

We omit the proof and refer interested readers to [4] (Lemma 3, Propositions 1 and 2) for details.
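To make the product-form distribution concrete, the following Python sketch enumerates the feasible schedules of a tiny conflict graph and computes their stationary probabilities. It assumes the standard Q-CSMA form, in which the probability of a schedule is proportional to the product of $p_l / (1 - p_l)$ over its active links, together with $p_l = e^{w_l}/(e^{w_l} + 1)$, so that each factor equals $e^{w_l}$. The function names, the conflict graph, and the weights are hypothetical.

```python
from itertools import combinations
from math import exp


def feasible_schedules(neighbors):
    """Yield every set of links in which no two links interfere."""
    links = sorted(neighbors)
    for r in range(len(links) + 1):
        for subset in combinations(links, r):
            if all(b not in neighbors[a] for a, b in combinations(subset, 2)):
                yield frozenset(subset)


def stationary_distribution(neighbors, weight):
    """Product form: pi(M) is proportional to exp(sum of weights of links in M)."""
    unnorm = {m: exp(sum(weight[l] for l in m)) for m in feasible_schedules(neighbors)}
    z = sum(unnorm.values())  # normalizing constant
    return {m: v / z for m, v in unnorm.items()}


# Hypothetical example: three links in a line, the middle one conflicts with both ends.
neighbors = {0: {1}, 1: {0, 2}, 2: {1}}
weight = {0: 2.0, 1: 3.0, 2: 2.0}
pi = stationary_distribution(neighbors, weight)
most_likely = max(pi, key=pi.get)
print(sorted(most_likely), pi[most_likely])
```

In this example the schedule with the largest total weight is also the most likely one under the stationary distribution, which is the mechanism behind the near-max-weight guarantee of Lemma 6 when weights are large.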

Note that we have (80) since Lemma 19 also holds under LQ-CSMA. Applying Lemma 6 and following
the same line of analysis as in the proof of Lemma 15, we can show that the sub-system of shadow
queues $\hat{q}$ is stable under LQ-CSMA in the fluid limit model.

Lemma 7: Given any $\theta$ and $\gamma$, $0 < \theta, \gamma < 1$, with probability greater than $1 - \theta$, the sub-system of
shadow queues $\hat{q}$ operating under LQ-CSMA satisfies the following: for any $\zeta > 0$, there exists a finite $T_4 > 0$
such that, for any fluid model solution with $\|\hat{q}(0)\| \le 1$, we have

$$\|\hat{q}(t)\| \le \zeta, \quad \text{for all } t \ge T_4, \tag{13}$$

for any arrival rate vector strictly inside $(1 - \gamma) \Lambda^*$.
The proof is provided in Appendix H 

The LQ-CSMA algorithm combined with priority queueing discipline and FIFO queueing discipline is 
called PLQ-CSMA and FLQ-CSMA, respectively. We present the main results of this section as follows. 
Proposition 8: PLQ-CSMA is throughput-optimal. 

Proposition 9: FLQ-CSMA is throughput-optimal in networks where flows do not form loops. 

Since the fluid limit model for the sub-system of shadow queues $\hat{q}$ is stable from Lemma 7, the results
of Propositions 8 and 9 follow from the same line of analysis as in the proofs of Propositions 2 and 4, respectively.
We omit the proofs.

VI. Numerical Results

In this section, we evaluate different scheduling schemes through simulations. We compare the scheduling
performance of HQ-MWS, PLQ-MWS, and FLQ-MWS with the original back-pressure (BP) algorithm under
the node-exclusive interference model. Note that we focus on the node-exclusive interference model only
for the purpose of illustration. Our scheduling schemes can be applied to general interference constraints
as specified in Section II. We will first focus on a simple linear network topology to illustrate the
advantages of the proposed schemes, and further validate our theoretical results in a larger and more
realistic grid network topology. The impact of the parameter $\epsilon$ on the scheduling performance will also
be explored and discussed.

^It is also called the primary or 1-hop interference model, where two links sharing a common node cannot be activated
simultaneously. It is known to be a good representation for Bluetooth or FH-CDMA networks [2].

[Figure 1: (a) Linear network topology with ten links; (b) Average delay versus offered load $\lambda$.]

Fig. 1. Performance of BP, HQ-MWS, PLQ-MWS and FLQ-MWS in a linear network topology ($\epsilon = 0.005$).

First, we evaluate and compare the scheduling performance of HQ-MWS, PLQ-MWS, FLQ-MWS and
the back-pressure algorithm in a simple linear network that consists of 11 nodes and 10 links as shown
in Fig. 1(a), where nodes are represented by circles and links are represented by dashed lines labeled with
their link capacities. We establish 10 flows that are represented by arrows, where each flow $i$ is from
node 1 to node $i + 1$ via all the nodes in between. We consider uniform traffic where all flows have
packet arrivals at each time slot following a Poisson distribution with the same mean rate $\lambda > 0$. We run
our simulations while varying the traffic load $\lambda$. Clearly, in this scenario, any traffic load with $\lambda < 0.5$ is
feasible. We use $\epsilon = 0.005$ for HQ-MWS, PLQ-MWS and FLQ-MWS. We evaluate the performance
by measuring the average packet delay (in units of time slots) over all the delivered packets (those that reach their
respective destination nodes) in the network.

Fig. 1(b) plots the average delays under different offered loads to examine the performance limits of
different scheduling schemes. Each result represents a simulation run that lasts for 10^ time slots. Since
the optimal throughput region $\Lambda^*$ is defined as the set of arrival rate vectors under which queue lengths
and thus delays remain finite, we can consider the traffic load under which the average delay increases
rapidly as the boundary of the optimal throughput region. Fig. 1(b) shows that all schemes achieve the
same boundary (i.e., $\lambda < 0.5$), which supports our theoretical results on throughput optimality. Moreover,
all three proposed schemes achieve substantially better delay performance than the back-pressure
algorithm. This is because under the back-pressure algorithm, the queue lengths have to build up along
the route a flow takes from the destination to the source, and in general, a link at an earlier hop has a larger
queue length. This leads to poor delay performance, especially when the route of a flow is lengthy, which
is the case in Fig. 1(a). Note that in this specific scenario, there is only one per-hop queue at each link
under HQ-MWS. Hence, HQ-MWS is equivalent to PLQ-MWS and FLQ-MWS in this scenario, which
explains why the three proposed schemes perform the same in Fig. 1(b).

[Figure 2: (a) Linear network topology with ten links; (b) Average delay.]

Fig. 2. Performance of BP, HQ-MWS, PLQ-MWS and FLQ-MWS in a linear network topology ($\epsilon = 0.005$).

Second, we evaluate the performance of the proposed schemes in the same linear network as in the
previous case while reversing the direction of each flow. The new topology is illustrated in Fig. 2(a). In this
scenario, the number of per-hop queues HQ-MWS maintains for each link is the same as the number of
flows passing through that link. Hence, HQ-MWS is expected to operate differently from PLQ-MWS and
FLQ-MWS, and to achieve different (and potentially poorer) delay performance. All the other simulation
settings are kept the same as in the previous case. Fig. 2(b) shows that all schemes achieve the same
boundary (i.e., $\lambda < 0.5$) in this scenario, which again supports our theoretical results on throughput
performance. However, we observe that HQ-MWS has the worst delay performance, while PLQ-MWS
and FLQ-MWS achieve substantially better performance. This is because PLQ-MWS and FLQ-MWS
transmit packets more efficiently and do not waste service as long as there are enough packets at the
activated link, while the back-pressure algorithm and HQ-MWS maintain multiple queues for each link,
and may waste service if the activated queue has fewer packets than the link capacity. HQ-MWS has
larger delays than the back-pressure algorithm because the scheduling decisions of HQ-MWS are based
on the shadow queue lengths rather than the actual queue lengths: a queue with very small (or even
zero) queue length could be activated. This introduces another type of inefficiency in HQ-MWS. Note
that PLQ-MWS and FLQ-MWS also make scheduling decisions based on the shadow queue lengths.
However, their performance improvement from a single queue per link dominates the delay increase from
this inefficiency. These observations imply that maintaining per-link queues not only simplifies the data
structure, but also improves scheduling efficiency and reduces delays.

[Figure 3: (a) A grid network topology; (b) Average delay for MWS schemes with $\epsilon = 0.05$; (c) Average delay for CSMA schemes with $\epsilon = 0.005$.]

Fig. 3. Performance of all the proposed scheduling schemes in a grid network with 16 nodes and 24 links. In Fig. 3(b), the
vertical dotted line $\lambda = 0.37$ denotes an upper bound for the feasible values of $\lambda$.

Next, we evaluate the performance of all the proposed schemes in a larger grid network with 16 nodes
and 24 links as shown in Fig. 3(a), where the capacity of each link is shown beside the link and has been
carefully assigned to avoid traffic symmetry. Similar grid networks have been adopted in the
literature (e.g., [4], [6], [25]) to numerically evaluate scheduling performance. We establish 10 multi-hop
flows that are represented by arrows in Fig. 3(a). Again, we consider uniform traffic where each flow
has independent packet arrivals at each time slot following a Poisson distribution with the same mean rate
$\lambda > 0$. In this scenario, we can calculate an upper bound of $1/(4/8 + 2/10 + 2) = 10/27 \approx 0.37$ for
the feasible value of $\lambda$, by looking at the flows passing through node 6, which is the bottleneck in the
network.

We choose $\epsilon = 0.05$ for HQ-MWS, PLQ-MWS and FLQ-MWS. Under each scheduling scheme,
along with the back-pressure algorithm, we measure average packet delays under different offered loads
to examine their performance limits. Fig. 3(b) shows that the proposed schemes have higher packet
delays than the back-pressure algorithm when the traffic load is light (e.g., $\lambda < 0.15$). This is due to the
aforementioned inefficiency under the proposed schemes: since the scheduling decisions are based on
the shadow queue lengths rather than the actual queue lengths, queues with very small (or even zero)
queue length can be activated. However, the effect tends to decrease with heavier traffic load, since
the queue lengths are likely to be large. The results also show that the proposed schemes consistently
outperform the back-pressure algorithm when $\lambda > 0.15$. Note that with $\epsilon = 0.05$, the shadow traffic rate
vector is outside the optimal throughput region when $\lambda > 0.37/(1 + 0.05) \approx 0.35$. Interestingly, however,
the schedules chosen based on the shadow queue lengths can still stabilize the data queues even if
$0.35 < \lambda < 0.37$ (which is still feasible). Nevertheless, we will show later that this is not always the
case. For PLQ-CSMA and FLQ-CSMA, as in [4], we choose contention window size $W = 48$,
weight function $w_l(t) = \log(0.1 \hat{Q}_l(t))$, and link activation probability $p_l = \frac{e^{w_l(t)}}{e^{w_l(t)} + 1}$. We choose $\epsilon = 0.005$
for PLQ-CSMA and FLQ-CSMA, and plot their average delays over offered loads in Fig. 3(c), along
with the back-pressure algorithm. Fig. 3(c) shows that although PLQ-CSMA and FLQ-CSMA achieve
the optimal throughput performance, they suffer from very poor delay performance, as expected. This
is due to the long mixing time of the underlying Markov chain formed by the transmission schedules
im. Note that in the above scenario, FLQ-MWS does not guarantee throughput optimality, since flows
$(5 \to 9 \to 10 \to 11 \to 12 \to 8)$ and $(12 \to 8 \to 7 \to 6 \to 5 \to 9)$ form a loop. However, the results in
Fig. 3(b) suggest that all the schemes, including FLQ-MWS, empirically achieve the optimal throughput
performance. This opens up an interesting question about the throughput performance of FLQ-MWS in
general settings.

Fig. 4. The impact of the value of $\epsilon$ on the scheduling performance.
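The two load thresholds quoted for the grid network can be reproduced with exact rational arithmetic. The Python sketch below only re-evaluates the expressions given in the text; the variable names are hypothetical, and the three terms are taken as stated for the bottleneck node without modeling the topology itself.

```python
from fractions import Fraction

# Terms of the bottleneck expression 4/8 + 2/10 + 2, as given in the text.
bottleneck_terms = [Fraction(4, 8), Fraction(2, 10), Fraction(2)]
bound = 1 / sum(bottleneck_terms)          # upper bound on the feasible load

epsilon = Fraction(5, 100)                 # shadow-traffic parameter used for the MWS schemes
shadow_limit = bound / (1 + epsilon)       # load beyond which the shadow rates leave the region

print(bound, float(bound), float(shadow_limit))
```

The exact bound is 10/27, and dividing it by $1 + \epsilon$ gives the load beyond which the inflated shadow arrival rates are no longer inside the throughput region, even though the actual arrival rates still are.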

Finally, we investigate the sensitivity of the scheduling performance to the parameter $\epsilon$, by running simulations
for PLQ-CSMA and FLQ-CSMA with different values of $\epsilon$ in the grid network in Fig. 3(a). Since
the performances of PLQ-CSMA and FLQ-CSMA are very close, we report only the results for
FLQ-CSMA in Fig. 4, where we plot average packet delays over the offered load $\lambda$ for FLQ-CSMA with
$\epsilon = 0$, $0.001$, $0.005$ and $0.05$, respectively. The results show that the delay performance generally improves
with a larger value of $\epsilon$, in particular under moderate and heavy traffic loads (e.g., $\lambda > 0.25$). This is
because a larger value of $\epsilon$ leads to more aggressive link activations. However, it can be observed that
a larger value of $\epsilon$ (e.g., $\epsilon = 0.05$) could make the system unstable when the offered load is close to
the capacity boundary (e.g., $\lambda > 0.35$). On the other hand, the impact of $\epsilon$ becomes marginal under
light traffic loads (i.e., when $\lambda$ is small), as the inefficiency of small queue activation dominates the scheduling
performance. Interestingly, although we require $\epsilon$ to be positive in the analysis for throughput optimality,
the simulation results show that the proposed schemes can empirically achieve the optimal throughput
performance even when $\epsilon = 0$, though with much larger delays.

VII. Conclusion 

In this paper, we developed scheduling policies with per-hop or per-link queues and a shadow algorithm 
to achieve the overall goal of removing per-flow or per-destination information requirement, simplifying 
queue structure, exploiting only local information, and potentially reducing delay. We showed throughput 
optimality of the proposed schemes that use only the readily available hop-count information, using 
fluid limit techniques via an inductive argument. We further simplified the solution using FIFO queueing 
discipline with per-link queues and showed that this is also throughput-optimal in networks without 
flow-loops. The problem of proving throughput optimality in general networks with algorithms (like 
FLQ-MWS) that use only per-link information remains an important open and challenging problem. 
Further, it is also worthwhile to investigate the problem with dynamic routing and see if per-flow and 
per-destination information can be removed even when routes are not fixed. 

Appendix A 
Proof of Proposition 1

To begin with, let $Q(t) = [Q_{l,k}(t)]$ and $\hat{Q}(t) = [\hat{Q}_{l,k}(t)]$ denote the queue length vector and the
shadow queue length vector at time slot $t$, respectively. We use $\|\cdot\|$ to denote the $L_1$-norm of a
vector, e.g., $\|Q(t)\| = \sum_{l \in \mathcal{E}} \sum_{k=1}^{L^{\max}} Q_{l,k}(t)$. We let $m_{l,k}(i)$ be the index of the flow to which the
$i$-th packet of queue $Q_{l,k}$ belongs. In particular, $m_{l,k}(1)$ indicates the index of the flow to which
the head-of-line packet of queue $Q_{l,k}$ belongs. We define the state of queue $Q_{l,k}$ at time slot $t$ as
$\mathcal{Q}_{l,k}(t) = [m_{l,k}(1), \ldots, m_{l,k}(Q_{l,k}(t))]$ in an increasing order of the arriving time, or an empty sequence
if $Q_{l,k}(t) = 0$. Then we denote its vector by $\mathcal{Q}(t) = [\mathcal{Q}_{l,k}(t)]$. Define $\mathbb{Z}_S = \{1, 2, \ldots, |\mathcal{S}|\}$, and let
$\mathbb{Z}_S^{\infty}$ be the set of finitely terminated sequences taking values in $\mathbb{Z}_S$. It is evident that $\mathcal{Q}_{l,k}(t) \in \mathbb{Z}_S^{\infty}$ and
hence $\mathcal{Q}(t) \in (\mathbb{Z}_S^{\infty})^{|\mathcal{E}| \times L^{\max}}$. We define $X(t) = (\mathcal{Q}(t), \hat{Q}(t), \frac{1}{t+1} A(t))$, and then $X = (X(t), t \ge 0)$ is
the process describing the behavior of the underlying system. Note that in the third term of $X(t)$, we
use $\frac{1}{t+1} A(t)$ instead of $\frac{1}{t} A(t)$ so that it is well-defined when $t = 0$. Clearly, the evolution of $X$ forms
a countable Markov chain under HQ-MWS. We abuse the notation of the $L_1$-norm by writing the norm
of $X(t)$ as $\|X(t)\| = \|Q(t)\| + \|\hat{Q}(t)\| + \frac{1}{t+1} \|A(t)\|$. Let $X^{(x)}$ denote a process $X$ with an initial
condition such that

$$\|X^{(x)}(0)\| = x. \tag{14}$$

The following lemma was derived in [15] for continuous-time countable Markov chains, and it follows
from more general results in [26] for discrete-time countable Markov chains.

Lemma 10 (Theorem 4 of [15]): Suppose that there exist an $\epsilon > 0$ and a finite integer $T > 0$ such that
for any sequence of processes $\{\frac{1}{x} X^{(x)}(xT), x = 1, 2, \ldots\}$, we have

$$\limsup_{x \to \infty} E\left[\frac{1}{x} \|X^{(x)}(xT)\|\right] \le 1 - \epsilon. \tag{15}$$

Then, the Markov chain $X$ is stable.

Lemma 10 implies the stability of the network. A stability criterion of type (15) leads to a fluid limit
approach [17] to the stability problem of queueing systems. We start our analysis by establishing the fluid
limit model as in HI, [17]. We define another process $Y = (F, U, Q, \Pi, \Psi, A, D, P, \hat{Q}, \hat{\Pi}, \hat{\Psi}, \hat{A}, \hat{D}, \hat{P})$,
where the tuple denotes a list of vector processes. Clearly, a sample path of $Y^{(x)}$ uniquely defines the
sample path of $X^{(x)}$. Then we extend the definition of $Y$ to each continuous time $t \ge 0$ as
$Y^{(x)}(t) = Y^{(x)}(\lfloor t \rfloor)$.

Recall that a sequence of functions $f_n(\cdot)$ is said to converge to a function $f(\cdot)$ uniformly over compact
(u.o.c.) intervals if for all $t \ge 0$, $\lim_{n \to \infty} \sup_{0 \le t' \le t} |f_n(t') - f(t')| = 0$. Next, we consider a sequence of
processes $\{\frac{1}{x_n} Y^{(x_n)}(x_n \cdot)\}$ that is scaled both in time and space. Then, using the techniques of Theorem 4.1
of [17] or Lemma 1 of [15], we can show the convergence properties of the sequences in the following
lemma.

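Lemma 11: With probability one, for any sequence of processes $\{\frac{1}{x_n} Y^{(x_n)}(x_n \cdot)\}$, where $\{x_n\}$ is a
sequence of positive integers with $x_n \to \infty$, there exists a subsequence $\{x_{n_j}\}$ with $x_{n_j} \to \infty$ as $j \to \infty$
such that the following u.o.c. convergences hold:

\begin{align}
\frac{1}{x_{n_j}} F_s^{(x_{n_j})}(x_{n_j} t) &\to f_s(t), \tag{16} \\
\frac{1}{x_{n_j}} U_{s,k}^{(x_{n_j})}(x_{n_j} t) &\to u_{s,k}(t), \tag{17} \\
\frac{1}{x_{n_j}} A_{l,k}^{(x_{n_j})}(x_{n_j} t) &\to a_{l,k}(t), \tag{18} \\
\frac{1}{x_{n_j}} \hat{A}_{l,k}^{(x_{n_j})}(x_{n_j} t) &\to \hat{a}_{l,k}(t), \tag{19} \\
\frac{1}{x_{n_j}} Q_{l,k}^{(x_{n_j})}(x_{n_j} t) &\to q_{l,k}(t), \tag{20} \\
\frac{1}{x_{n_j}} \hat{Q}_{l,k}^{(x_{n_j})}(x_{n_j} t) &\to \hat{q}_{l,k}(t), \tag{21} \\
\frac{1}{x_{n_j}} D_{l,k}^{(x_{n_j})}(x_{n_j} t) &\to d_{l,k}(t), \tag{22} \\
\frac{1}{x_{n_j}} \hat{D}_{l,k}^{(x_{n_j})}(x_{n_j} t) &\to \hat{d}_{l,k}(t), \tag{23} \\
\frac{1}{x_{n_j}} \int_0^{x_{n_j} t} \Pi_{l,k}^{(x_{n_j})}(\tau) \, d\tau &\to \int_0^t \pi_{l,k}(\tau) \, d\tau, \tag{24} \\
\frac{1}{x_{n_j}} \int_0^{x_{n_j} t} \hat{\Pi}_{l,k}^{(x_{n_j})}(\tau) \, d\tau &\to \int_0^t \hat{\pi}_{l,k}(\tau) \, d\tau, \tag{25} \\
\frac{1}{x_{n_j}} \int_0^{x_{n_j} t} \Psi_{l,k}^{(x_{n_j})}(\tau) \, d\tau &\to \int_0^t \psi_{l,k}(\tau) \, d\tau, \tag{26} \\
\frac{1}{x_{n_j}} \int_0^{x_{n_j} t} \hat{\Psi}_{l,k}^{(x_{n_j})}(\tau) \, d\tau &\to \int_0^t \hat{\psi}_{l,k}(\tau) \, d\tau, \tag{27} \\
\frac{1}{x_{n_j}} \int_0^{x_{n_j} t} P_{l,k}^{(x_{n_j})}(\tau) \, d\tau &\to \int_0^t p_{l,k}(\tau) \, d\tau, \tag{28} \\
\frac{1}{x_{n_j}} \int_0^{x_{n_j} t} \hat{P}_{l,k}^{(x_{n_j})}(\tau) \, d\tau &\to \int_0^t \hat{p}_{l,k}(\tau) \, d\tau, \tag{29}
\end{align}

where the functions $f_s, u_{s,k}, a_{l,k}, \hat{a}_{l,k}, q_{l,k}, \hat{q}_{l,k}, d_{l,k}, \hat{d}_{l,k}$ are Lipschitz continuous in $[0, \infty)$.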

Note that the proof of the above lemma is quite standard using the techniques developed in [15], [17],
[27]. We provide the proof in the appendix for completeness.

Any set of limiting functions $(f, u, q, \pi, \psi, a, d, p, \hat{q}, \hat{\pi}, \hat{\psi}, \hat{a}, \hat{d}, \hat{p})$ is called a fluid limit. The family of
these fluid limits is associated with our original stochastic network. The scaled sequences $\{\frac{1}{x_n} Y^{(x_n)}(x_n \cdot)\}$
and their limits are referred to as a fluid limit model [16]. Since some of the limiting functions, namely
$f_s, u_{s,k}, a_{l,k}, \hat{a}_{l,k}, q_{l,k}, \hat{q}_{l,k}, d_{l,k}, \hat{d}_{l,k}$, are Lipschitz continuous in $[0, \infty)$, they are absolutely continuous.
Therefore, these limiting functions are differentiable at almost all times $t \in [0, \infty)$, which we call regular
times.

Next, we will present the fluid model equations of the system, i.e., Eqs. (30)-(45). Fluid model equations
can be thought of as belonging to a fluid network which is the deterministic equivalent of the original
stochastic network. Any set of functions satisfying the fluid model equations is called a fluid model
solution of the system. We show in the following lemma that any fluid limit is a fluid model solution.
Lemma 12: Any fluid limit $(f, u, q, \pi, \psi, a, d, p, \hat{q}, \hat{\pi}, \hat{\psi}, \hat{a}, \hat{d}, \hat{p})$ satisfies the following equations:



fs{t) = Xst, 

mA'^) = Qi,k{0) + ai^t) - di^t), 

ai,k{t) = fQPi,k{T)dT, 
di,k{t) = f^tpiAT)dT, 



TtliMt) =Pi,k{t) -il^i,k{t), 



m,k{t) 



Pl,k{t) - vrz,fc(t), 
^ -vr/,fc(t))+, 
m^kit) = qi,k{0) + di^t) - di^t), 



if qi,k{t) > 0> 
otherwise. 



ai,k{^) = IoPi;k{'^)dT, 
di,k{t) = ipi^k{'r)dT, 

m%kit) =Pi,k{t) -i'i,k{t), 

Vl,k{t) - T^l,k{'t)i 

lk(o)|| + ||g(o)|| <i, 



Ql,k{^) 



if qi^t) > 0, 
otherwise. 



(30) 
(31) 
(32) 
(33) 
(34) 
(35) 
(36) 

(37) 

(38) 
(39) 
(40) 
(41) 
(42) 

(43) 

(44) 
(45) 



Proof: Note that (30) follows from the strong law of large numbers. Eqs. (31)-(35) and (38)-(41) are
satisfied from the definitions. Since each of the limiting functions $q_{l,k}(t)$ is differentiable at any regular
time $t \ge 0$, (36) is satisfied from (33) and (34), by taking the derivative of both sides of (31). Similarly, (42)
is satisfied. Further, (36) and (42) can be rewritten as (37) and (43), respectively. Eq. (44) is from the
initial configuration (14), and (45) is due to the operations of the HQ-MWS algorithm. ■
Due to the result of Lemma 10, we want to show that the stability criterion of (15) holds. Note
that from system causality, we have ai^t) < tj^s-^ik^s + Z]sSh9s,h(0) for all hnk / G £ and all 
$1 \le k \le L^{\max}$, for all $t \ge 0$. Then, we have
almost surely, and thus. 



limj^oo ^ 



(46) 



almost surely, for all $t \ge 0$. Therefore, it remains to be shown that the fluid limit model for the joint system of data queues and shadow queues is stable (Lemma 18). Then, by uniform integrability of the sequence $\{\frac{1}{x}\|\mathcal{X}^{(x)}(xT)\|, x = 1, 2, \dots\}$, it follows that (15) holds. We divide the proof of Lemma 18 into two parts: 1) in Lemma 15, we show that the sub-system consisting of shadow queues is stable; 2) in Lemma 17, we show that the sub-system consisting of data queues is stable. Before proving Lemmas 15 and 17, we state and prove Lemmas 13 and 16, which are used to prove Lemmas 15 and 17, respectively.

The following lemma shows that the instantaneous shadow arrival rate is bounded in the fluid limit, 
and is used to show that the fluid limit model for the sub-system consisting of shadow queues is stable 
under HQ-MWS. 

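Lemma 13: For all (scaled) time $t > 0$, and for all links $l \in \mathcal{L}$ and $1 \le k \le L^{\max}$, with probability one, the following inequality holds:

$$\hat{p}_{l,k}(t) \le (1+\epsilon)\Big(\sum_s H^s_{l,k} \lambda_s + \frac{1}{t}\Big), \tag{47}$$

and in particular,

$$\hat{p}_{l,1}(t) = (1+\epsilon) \sum_s H^s_{l,1} \lambda_s. \tag{48}$$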

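Proof: We start by stating the following lemma, which will be used to prove Lemma 13.
Lemma 14: If a sequence $\{F(n), n = 1, 2, \dots\}$ satisfies $\lim_{n\to\infty} F(n) = f$, then the following holds:

$$\lim_{n\to\infty} \frac{\sum_{r=1}^{n} F(r)}{n} = f.$$

Proof: We want to show that, for any $\epsilon_1 > 0$, there exists an $N < \infty$ such that

$$\left| \frac{\sum_{r=1}^{n} F(r)}{n} - f \right| < \epsilon_1, \quad \text{for all } n > N.$$

Since $\lim_{n\to\infty} F(n) = f$, for any $\epsilon_1 > 0$ there exists an $N_1 < \infty$ such that $|F(n) - f| < \frac{\epsilon_1}{3}$ for all $n \ge N_1$. Letting $N = \max\big\{N_1, \frac{3\sum_{r=1}^{N_1-1}|F(r)|}{\epsilon_1}, \frac{3(N_1-1)|f|}{\epsilon_1}\big\}$, then for all $n > N$, we have

$$\begin{aligned}
\left| \frac{\sum_{r=1}^{n} F(r)}{n} - f \right|
&\le \frac{\big|\sum_{r=1}^{N_1-1} F(r)\big|}{n} + \left| \frac{\sum_{r=N_1}^{n} F(r)}{n} - \frac{n-N_1+1}{n} f \right| + \frac{N_1-1}{n}|f| \\
&< \frac{\epsilon_1}{3} + \frac{n-N_1+1}{n}\cdot\frac{\epsilon_1}{3} + \frac{\epsilon_1}{3} \le \epsilon_1.
\end{aligned} \tag{49}$$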
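Lemma 14 is the classical statement that averages of a convergent sequence converge to the same limit. The following minimal Python sketch illustrates it numerically; the sequence $F(n) = 2 + 5/n$ is a hypothetical example chosen only for illustration and does not appear in the analysis.

```python
def cesaro_means(seq):
    """Return the running averages (1/n) * sum_{r=1}^{n} F(r) for n = 1..len(seq)."""
    means, total = [], 0.0
    for n, value in enumerate(seq, start=1):
        total += value
        means.append(total / n)
    return means


# Hypothetical sequence F(n) = 2 + 5/n, which converges to f = 2.
F = [2.0 + 5.0 / n for n in range(1, 200_001)]
means = cesaro_means(F)

gap_at_1e3 = abs(means[999] - 2.0)   # distance of the average from f at n = 1000
gap_at_2e5 = abs(means[-1] - 2.0)    # distance of the average from f at n = 200000
print(gap_at_1e3, gap_at_2e5)
```

The averages approach $f$ more slowly than the sequence itself, which is why the proof splits the sum at $N_1$ and absorbs the first $N_1 - 1$ terms by taking $n$ large.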



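Now, we prove Lemma 13. Note that we have

$$A_{l,k}(t) \le \sum_s H^s_{l,k} F_s(t) + \sum_s \sum_h Q_{s,h}(0), \tag{50}$$

for any $t \ge 0$ and for any link $l \in \mathcal{L}$ and $1 \le k \le L^{\max}$, due to system causality.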

Since the arrival processes satisfy the SLLN of type (1), we obtain from Lemma 14 that with probability one,

$$\lim_{n\to\infty} \frac{\sum_{\tau=1}^{n} \frac{F_s(\tau)}{\tau}}{n} = \lambda_s, \quad \text{for all } s \in \mathcal{S}. \tag{51}$$

Note that we will omit the superscript $(x_{n_j})$ of the random variables (depending on the choice of the sequence $\{x_{n_j}\}$) throughout the rest of the proof for notational convenience (e.g., we use $A_{l,k}(t)$ to denote $A^{(x_{n_j})}_{l,k}(t)$). Then, for all regular time $t > 0$, all links $l \in \mathcal{L}$ and $1 \le k \le L^{\max}$, we have

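$$\begin{aligned}
\hat{p}_{l,k}(t) &= \frac{d}{dt}\int_0^t \hat{p}_{l,k}(\tau)\,d\tau = \lim_{\delta\to 0} \frac{\int_t^{t+\delta} \hat{p}_{l,k}(\tau)\,d\tau}{\delta} \\
&= \lim_{\delta\to 0}\lim_{j\to\infty} \frac{1}{\delta x_{n_j}} \sum_{\tau=\lceil t x_{n_j}\rceil}^{\lfloor (t+\delta) x_{n_j}\rfloor} \hat{P}_{l,k}(\tau) \\
&= (1+\epsilon) \lim_{\delta\to 0}\lim_{j\to\infty} \frac{1}{\delta x_{n_j}} \sum_{\tau=\lceil t x_{n_j}\rceil}^{\lfloor (t+\delta) x_{n_j}\rfloor} \frac{A_{l,k}(\tau)}{\tau} \\
&\le (1+\epsilon) \lim_{\delta\to 0}\lim_{j\to\infty} \frac{1}{\delta x_{n_j}} \sum_{\tau=\lceil t x_{n_j}\rceil}^{\lfloor (t+\delta) x_{n_j}\rfloor} \frac{\sum_s H^s_{l,k} F_s(\tau) + \sum_s\sum_h Q_{s,h}(0)}{\tau} \\
&= (1+\epsilon) \sum_s H^s_{l,k} \lim_{\delta\to 0}\left[ \lim_{j\to\infty} \frac{\sum_{\tau=1}^{\lfloor (t+\delta) x_{n_j}\rfloor} \frac{F_s(\tau)}{\tau}}{\delta x_{n_j}} - \lim_{j\to\infty} \frac{\sum_{\tau=1}^{\lceil t x_{n_j}\rceil - 1} \frac{F_s(\tau)}{\tau}}{\delta x_{n_j}} \right] \\
&\quad + (1+\epsilon) \lim_{\delta\to 0}\lim_{j\to\infty} \frac{\sum_s\sum_h Q_{s,h}(0)}{\delta x_{n_j}} \sum_{\tau=\lceil t x_{n_j}\rceil}^{\lfloor (t+\delta) x_{n_j}\rfloor} \frac{1}{\tau} \\
&\le (1+\epsilon) \sum_s H^s_{l,k} \lambda_s \lim_{\delta\to 0}\Big(\frac{t+\delta}{\delta} - \frac{t}{\delta}\Big) + (1+\epsilon)\frac{1}{t} \\
&= (1+\epsilon)\Big(\sum_s H^s_{l,k} \lambda_s + \frac{1}{t}\Big),
\end{aligned}$$

where in the last inequality, the first term is from (51), and the second term is from the facts that i) $\|q(0)\| + \|\hat{q}(0)\| \le 1$ implies $\lim_{j\to\infty} \frac{\sum_s\sum_h Q_{s,h}(0)}{x_{n_j}} \le 1$; and ii)

$$\lim_{j\to\infty} \int_{\lceil t x_{n_j}\rceil}^{\lfloor (t+\delta) x_{n_j}\rfloor + 1} \frac{1}{\tau}\,d\tau \le \lim_{j\to\infty} \sum_{\tau=\lceil t x_{n_j}\rceil}^{\lfloor (t+\delta) x_{n_j}\rfloor} \frac{1}{\tau} \le \lim_{j\to\infty} \int_{\lceil t x_{n_j}\rceil - 1}^{\lfloor (t+\delta) x_{n_j}\rfloor} \frac{1}{\tau}\,d\tau,$$

i.e.,

$$\lim_{j\to\infty} \log\left(\frac{\lfloor (t+\delta) x_{n_j}\rfloor + 1}{\lceil t x_{n_j}\rceil}\right) \le \lim_{j\to\infty} \sum_{\tau=\lceil t x_{n_j}\rceil}^{\lfloor (t+\delta) x_{n_j}\rfloor} \frac{1}{\tau} \le \lim_{j\to\infty} \log\left(\frac{\lfloor (t+\delta) x_{n_j}\rfloor}{\lceil t x_{n_j}\rceil - 1}\right),$$

which gives

$$\lim_{j\to\infty} \sum_{\tau=\lceil t x_{n_j}\rceil}^{\lfloor (t+\delta) x_{n_j}\rfloor} \frac{1}{\tau} = \log\frac{t+\delta}{t}. \tag{52}$$

Combining i) and ii), we have

$$\lim_{\delta\to 0}\lim_{j\to\infty} \frac{\sum_s\sum_h Q_{s,h}(0)}{\delta x_{n_j}} \sum_{\tau=\lceil t x_{n_j}\rceil}^{\lfloor (t+\delta) x_{n_j}\rfloor} \frac{1}{\tau} \le \lim_{\delta\to 0}\Big(\frac{1}{\delta}\log\frac{t+\delta}{t}\Big) = \frac{1}{t},$$

where the equality is from L'Hospital's Rule.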
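The two limits used for the $\frac{1}{t}$ term can be checked numerically. The sketch below evaluates the windowed harmonic sum $\sum_{\tau=\lceil tx\rceil}^{\lfloor (t+\delta)x\rfloor} \frac{1}{\tau}$ for growing $x$ and compares it with $\log\frac{t+\delta}{t}$; the values of $t$, $\delta$ and $x$ are arbitrary illustrative choices, not values from the analysis.

```python
import math


def harmonic_window(t, delta, x):
    """Sum of 1/tau over tau = ceil(t*x), ..., floor((t+delta)*x)."""
    lo = math.ceil(t * x)
    hi = math.floor((t + delta) * x)
    return sum(1.0 / tau for tau in range(lo, hi + 1))


t, delta = 2.0, 0.5                      # illustrative values only
target = math.log((t + delta) / t)       # claimed limit of the windowed sum
errors = [abs(harmonic_window(t, delta, x) - target) for x in (10, 1_000, 100_000)]

# (1/delta) * log((t + delta)/t) approaches 1/t as delta shrinks (L'Hospital's Rule).
small = 1e-6
slope = math.log((t + small) / t) / small
print(errors, slope)
```

The error shrinks roughly in proportion to $1/x$, consistent with the integral sandwich used above.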

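So far, we have shown (47). Note that when $k = 1$, Eq. (50) reduces to $A_{l,1}(t) = \sum_s H^s_{l,1} F_s(t)$. Then, in the above derivation of $\hat{p}_{l,k}(t)$, the first inequality (which follows from (50)) becomes an equality and the right-hand side of this inequality becomes

$$(1+\epsilon) \lim_{\delta\to 0}\lim_{j\to\infty} \frac{1}{\delta x_{n_j}} \sum_{\tau=\lceil t x_{n_j}\rceil}^{\lfloor (t+\delta) x_{n_j}\rfloor} \frac{\sum_s H^s_{l,1} F_s(\tau)}{\tau}.$$

Hence, we obtain (48). ■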
Remark: Lemma [13] holds when the exogenous arrival processes satisfy the SLLN, and the shadow 
arrivals are controlled as in dH). Note that Lemma [13] does not hold for data queues Qi^k, since the data 
arrival processes do not satisfy (jU) due to their dependency on the service of the previous hop queues. 
Lemma 13 is important for proving the stability of the shadow queues, and implies that in the fluid limit model, the instantaneous arrival rate of shadow queues is strictly inside the optimal throughput region $\Lambda^*$ after a finite time.

Then, in the following lemma, we show that the fluid limit model for the sub-system consisting of shadow queues is stable under HQ-MWS.

Lemma 15: The fluid limit model for the sub-system of shadow queues $\hat{q}$ operating under HQ-MWS satisfies that: For any $\zeta > 0$, there exists a finite $T_1 > 0$ such that for any fluid model solution with $\|\hat{q}(0)\| \le 1$, we have that with probability one,

$$\|\hat{q}(t)\| \le \zeta, \quad \text{for all } t \ge T_1,$$

for any arrival rate vector strictly inside $\Lambda^*$.

Proof: Suppose $\lambda$ is strictly inside $\Lambda^*$. Then we can find a small $\epsilon > 0$ such that $(1+\epsilon)\lambda$ is strictly inside $\Lambda^*$, and there exists a vector $\phi \in Co(\mathcal{M})$ such that $(1+\epsilon)\lambda < \phi$, i.e., $(1+\epsilon)\sum_s\sum_k H^s_{l,k}\lambda_s < \phi_l$ for all $l \in \mathcal{L}$. Let $\beta$ denote the smallest difference between the two vectors, which is defined as $\beta = \min_{l\in\mathcal{L}}\big(\phi_l - (1+\epsilon)\sum_s\sum_k H^s_{l,k}\lambda_s\big)$. Clearly, we have $\beta > 0$. Let $T'$ be a finite time such that $T' > \frac{(1+\epsilon)L^{\max}}{\beta}$; then we have $(1+\epsilon)\big(\sum_s\sum_k H^s_{l,k}\lambda_s + \frac{L^{\max}}{T'}\big) < \phi_l$. Let $\phi_{l,k} \triangleq (1+\epsilon)\big(\sum_s H^s_{l,k}\lambda_s + \frac{1}{T'}\big) + \frac{\phi_l - (1+\epsilon)\left(\sum_s\sum_k H^s_{l,k}\lambda_s + \frac{L^{\max}}{T'}\right)}{L^{\max}}$. Then, we have

$$\sum_k \phi_{l,k} = \phi_l, \tag{53}$$

and from (47), we have

$$\hat{p}_{l,k}(t) \le (1+\epsilon)\Big(\sum_s H^s_{l,k}\lambda_s + \frac{1}{T'}\Big) < \phi_{l,k}, \tag{54}$$

for all regular time $t \ge T'$. This implies that the instantaneous arrival rate of shadow queues is strictly inside the optimal throughput region $\Lambda^*$.

^Similar to [15], we consider a weaker criterion for the stability of the fluid limit model in Lemma 15, which can imply the stability of the original system from Lemma 10.

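We consider a quadratic-form Lyapunov function $V(\hat{q}(t)) = \frac{1}{2}\sum_l\sum_k (\hat{q}_{l,k}(t))^2$. It is sufficient to show that for any $\zeta_1 > 0$, there exist $\zeta_2 > 0$ and a finite time $T^* > 0$ such that at any regular time $t \ge T^*$, $V(\hat{q}(t)) \ge \zeta_1$ implies $\frac{D^+}{dt^+}V(\hat{q}(t)) \le -\zeta_2$. Since $\hat{q}(t)$ is differentiable for any regular time $t \ge T'$, we can obtain the derivative of $V(\hat{q}(t))$ as

$$\begin{aligned}
\frac{D^+}{dt^+}V(\hat{q}(t)) &= \sum_l\sum_k \hat{q}_{l,k}(t)\cdot\big(\hat{p}_{l,k}(t) - \hat{\pi}_{l,k}(t)\big) \\
&= \sum_l\sum_k \hat{q}_{l,k}(t)\cdot\big(\hat{p}_{l,k}(t) - \phi_{l,k}\big) + \sum_l\sum_k \hat{q}_{l,k}(t)\cdot\big(\phi_{l,k} - \hat{\pi}_{l,k}(t)\big),
\end{aligned} \tag{55}$$

where $\frac{D^+}{dt^+}V(\hat{q}(t)) = \lim_{\delta\downarrow 0}\frac{V(\hat{q}(t+\delta)) - V(\hat{q}(t))}{\delta}$, and the first equality is from (43).

Let us choose $\zeta_3 > 0$ such that $V(\hat{q}(t)) \ge \zeta_1$ implies $\max_{l\in\mathcal{L},\,1\le k\le L^{\max}} \hat{q}_{l,k}(t) \ge \zeta_3$. Then, in the final result of (55), we can conclude that the first term is bounded. That is,

$$\begin{aligned}
\sum_l\sum_k \hat{q}_{l,k}(t)\cdot\big(\hat{p}_{l,k}(t) - \phi_{l,k}\big) &\le -\zeta_3 \min_{l,k}\big(\phi_{l,k} - \hat{p}_{l,k}(t)\big) \\
&\le -\zeta_3 \min_{l,k}\Big(\phi_{l,k} - (1+\epsilon)\Big(\sum_s H^s_{l,k}\lambda_s + \frac{1}{T'}\Big)\Big) = -\zeta_2 < 0,
\end{aligned}$$

where the second inequality is from (54). For the second term, since HQ-MWS chooses schedules that maximize the shadow queue length weighted rate, the service rate satisfies that

$$\hat{\pi}(t) \in \arg\max_{\phi\in Co(\mathcal{M})} \sum_l \hat{q}_{l,k^*(l)}(t)\cdot\phi_l, \tag{56}$$

where i) $\hat{q}_{l,k^*(l)}(t) = \max_k \hat{q}_{l,k}(t)$, and ii) $\hat{\pi}_l(t) = \sum_k \hat{\pi}_{l,k}(t)$ with $\hat{\pi}_{l,k}(t) = 0$ when $\hat{q}_{l,k}(t) < \hat{q}_{l,k^*(l)}(t)$. This implies that $\sum_l\sum_k \hat{q}_{l,k}(t)\cdot\phi_{l,k} \le \sum_l\sum_k \hat{q}_{l,k^*}(t)\cdot\phi_{l,k} = \sum_l \hat{q}_{l,k^*}(t)\cdot\phi_l \le \sum_l \hat{q}_{l,k^*}(t)\cdot\hat{\pi}_l(t) = \sum_l\sum_k \hat{q}_{l,k}(t)\cdot\hat{\pi}_{l,k}(t)$, for all $\phi \in Co(\mathcal{M})$, where the first equality and the second inequality are from (53) and (56), respectively. Then, we obtain that the second term of (55) is non-positive. This shows that $V(\hat{q}(t)) \ge \zeta_1$ implies $\frac{D^+}{dt^+}V(\hat{q}(t)) \le -\zeta_2$ for all regular time $t \ge T^*$. Hence, it immediately follows that for any $\zeta > 0$, there exists a finite $T_1 \ge T^* > 0$ such that $\|\hat{q}(t)\| \le \zeta$, for all $t \ge T_1$. ■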
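The role of (56) in the drift argument can be illustrated with a small sketch: the schedule that maximizes the per-link weight $\max_k \hat{q}_{l,k}$ achieves a weighted rate at least as large as that of any convex combination of feasible schedules. The three-link interference pattern, the schedule set and the shadow queue values below are hypothetical, chosen only to make the example concrete.

```python
def link_weights(shadow_queues):
    """Per-link weight max_k qhat[l][k], as in the definition of k*(l)."""
    return [max(per_hop) for per_hop in shadow_queues]


def weighted_rate(weights, rates):
    """Compute sum_l weight_l * rate_l for a schedule or a rate vector."""
    return sum(w * r for w, r in zip(weights, rates))


def max_weight_schedule(schedules, shadow_queues):
    """Return a feasible schedule maximizing the shadow-queue weighted rate."""
    weights = link_weights(shadow_queues)
    return max(schedules, key=lambda s: weighted_rate(weights, s))


# Hypothetical 3-link line network in which adjacent links interfere.
SCHEDULES = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (0, 0, 1), (1, 0, 1)]
# Hypothetical shadow queue lengths qhat[l][k] for two hop-classes per link.
SHADOW = [[0.2, 0.5], [0.9, 0.1], [0.3, 0.3]]

best = max_weight_schedule(SCHEDULES, SHADOW)
weights = link_weights(SHADOW)
# A point in the convex hull: equal time-sharing between two schedules.
mix = tuple(0.5 * a + 0.5 * b for a, b in zip((1, 0, 1), (0, 1, 0)))
print(best, weighted_rate(weights, best), weighted_rate(weights, mix))
```

Because the objective is linear in $\phi$, maximizing over the finite schedule set is enough to maximize over its convex hull, which is the property the proof uses to show that the second term of (55) is non-positive.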
We next present Lemma 16, which is used to show that the sub-system consisting of data queues is stable under HQ-MWS in the fluid limit model.

Lemma 16: If data queues $q_{l,j}$ are stable for all $l \in \mathcal{L}$ and for all $j \le k$, then there exists a finite $T_1^k > 0$ such that for all regular time $t \ge T_1^k$ and for all $l \in \mathcal{L}$, we have that with probability one,

$$\hat{p}_{l,k+1}(t) \ge (1+\epsilon)\sum_s H^s_{l,k+1}\lambda_s.$$

The proof follows an argument similar to that used in the proof of Lemma 13, and is provided in Appendix C.
In the following lemma, using a hop-by-hop inductive argument, we show that the fluid limit model for the sub-system of data queues is stable.

Lemma 17: The fluid limit model of the sub-system of data queues $q$ operating under HQ-MWS is stable, i.e., there exists a finite $T_2 > 0$ such that, for any fluid model solution with $\|q(0)\| \le 1$, we have

$$\|q(t)\| = 0, \quad \text{for all } t \ge T_2,$$

for any arrival rate vector strictly inside $\Lambda^*$.

Proof: We prove the stability of data queues by induction.
Suppose $\lambda$ is strictly inside $\Lambda^*$. Then the sub-system of shadow queues $\hat{q}$ is stable from Lemma 15. Let us choose sufficiently small $\zeta > 0$ such that $\zeta < \epsilon\min_s\lambda_s$; then there exists a finite time $T_1 > 0$ such that we have $\|\hat{q}(t)\| \le \zeta$ for any regular time $t \ge T_1$. Thus, we have $\hat{\psi}_{l,k}(t) \ge \hat{p}_{l,k}(t) - \zeta$ from (42), for all $t \ge T_1$. Hence, for all data queues and all regular time $t \ge T_1$, we have

$$\pi_{l,k}(t) = \hat{\pi}_{l,k}(t) \ge \hat{p}_{l,k}(t) - \zeta, \tag{57}$$

from (45) and (41).

Now we show by induction that all data queues are stable in the fluid limit model.
Base Case:

First, note that $\pi_{l,1}(t) \ge (1+\epsilon)\sum_s H^s_{l,1}\lambda_s - \zeta$ from (48) and (57). Consider a sub-system that contains only queue $q_{l,1}$. From $p_{l,1}(t) = \sum_s H^s_{l,1}\lambda_s$ and (37), we have $\frac{d}{dt}q_{l,1}(t) = p_{l,1}(t) - \pi_{l,1}(t) \le -\epsilon\sum_s H^s_{l,1}\lambda_s + \zeta < 0$, if $q_{l,1}(t) > 0$. This implies that the sub-system that contains only $q_{l,1}$ is stable, for all $l \in \mathcal{L}$.
Induction Step:

Next, we show that, if $q_{l,j}$ is stable for all $l \in \mathcal{L}$ and all $j \le k$, then each queue $q_{l,k+1}$ is also stable for all $l \in \mathcal{L}$, where $1 \le k < L^{\max}$.

Since $q_{l,j}(t)$ is stable for all $l \in \mathcal{L}$ and all $j \le k$, i.e., there exists a finite $T^k > 0$ such that $q_{l,j}(t) = 0$ for all regular time $t \ge T^k$, we have $u_{s,k+1}(t) = u_{s,k}(t) + q_{s,k}(0) = \cdots = u_{s,1}(t) + \sum_{h\le k} q_{s,h}(0) = \lambda_s t + \sum_{h\le k} q_{s,h}(0)$ for all $s \in \mathcal{S}$ and for all regular time $t \ge T^k$. Thus, we have $a_{l,k+1}(t) = t\sum_s H^s_{l,k+1}\lambda_s + \sum_s H^s_{l,k+1}\sum_{h\le k} q_{s,h}(0)$ from (32), and $p_{l,k+1}(t) = \sum_s H^s_{l,k+1}\lambda_s$ from (33) by taking the derivative, for all $l \in \mathcal{L}$ and all regular time $t \ge T^k$. Then, note that we have $\hat{p}_{l,k+1}(t) \ge (1+\epsilon)\sum_s H^s_{l,k+1}\lambda_s$ from Lemma 16. Hence, we have $\pi_{l,k+1}(t) \ge (1+\epsilon)\sum_s H^s_{l,k+1}\lambda_s - \zeta$ from (57). Therefore, we have $\frac{d}{dt}q_{l,k+1}(t) = p_{l,k+1}(t) - \pi_{l,k+1}(t) \le -\epsilon\sum_s H^s_{l,k+1}\lambda_s + \zeta < 0$, if $q_{l,k+1}(t) > 0$. This implies that $q_{l,k+1}$ is stable for all $l \in \mathcal{L}$.

Therefore, the result follows by induction. ■
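The hop-by-hop structure of this induction can be visualized with a toy two-hop fluid tandem. This is only a sketch under strong simplifying assumptions that are not part of the analysis: a single flow, a constant service rate $\mu = (1+\epsilon)\lambda - \zeta$ at both hops, and a forward-Euler discretization. All parameter values are hypothetical.

```python
def simulate_tandem(lam, eps, zeta, q1, q2, horizon, dt=0.01):
    """Forward-Euler simulation of a two-hop fluid tandem (toy model)."""
    mu = (1 + eps) * lam - zeta            # assumed constant service rate at both hops
    steps = int(round(horizon / dt))
    trace = [(0.0, q1, q2)]
    for n in range(1, steps + 1):
        new_q1 = max(0.0, q1 + (lam - mu) * dt)
        departed = q1 + lam * dt - new_q1  # fluid leaving hop 1 arrives at hop 2
        q2 = max(0.0, q2 + departed - mu * dt)
        q1 = new_q1
        trace.append((n * dt, q1, q2))
    return trace


# Hypothetical parameters with zeta < eps * lam, as required in the proof.
trace = simulate_tandem(lam=0.5, eps=0.2, zeta=0.05, q1=0.5, q2=0.5, horizon=25.0)
t9, q1_at_9, q2_at_9 = trace[900]          # state at (scaled) time 9
t_end, q1_end, q2_end = trace[-1]          # state at (scaled) time 25
print(q1_at_9, q2_at_9, q1_end, q2_end)
```

In this toy run the second-hop queue does not start draining until the first-hop queue has emptied, after which its arrival rate drops to $\lambda$ and it drains at rate $\epsilon\lambda - \zeta$. This mirrors the order in which the induction establishes stability, although HQ-MWS itself does not use a constant service rate.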
The following lemma states that the fluid limit model of the joint system of data queues and shadow queues is stable, which follows immediately from Lemmas 15 and 17.

Lemma 18: The fluid limit model of the joint system of data queues $q$ and shadow queues $\hat{q}$ operating under HQ-MWS satisfies that: For any $\zeta > 0$, there exists a finite $T_2 > 0$ such that for any fluid model solution with $\|q(0)\| + \|\hat{q}(0)\| \le 1$, we have that with probability one,

$$\|q(t)\| + \|\hat{q}(t)\| \le \zeta, \quad \text{for all } t \ge T_2,$$

for any arrival rate vector strictly inside $\Lambda^*$.

Now, consider any fixed sequence of processes $\{\frac{1}{x}\mathcal{X}^{(x)}(xt), x = 1, 2, \dots\}$ (for simplicity also denoted by $\{x\}$). By Lemmas 11 and 18, we have that for any fixed $\zeta_1 > 0$, we can always choose a large enough integer $T > 0$ such that for any subsequence $\{x_n\}$ of $\{x\}$, there exists a further (sub)subsequence $\{x_{n_j}\}$ such that

$$\lim_{j\to\infty} \frac{1}{x_{n_j}}\Big(\|Q^{(x_{n_j})}(x_{n_j}T)\| + \|\hat{Q}^{(x_{n_j})}(x_{n_j}T)\|\Big) = \|q(T)\| + \|\hat{q}(T)\| \le \zeta_1$$

almost surely. This, along with (46), implies that

$$\lim_{j\to\infty} \frac{1}{x_{n_j}}\|\mathcal{X}^{(x_{n_j})}(x_{n_j}T)\| \le \zeta_1$$

almost surely, which in turn implies (for small enough $\zeta_1$) that

$$\limsup_{x\to\infty} \frac{1}{x}\|\mathcal{X}^{(x)}(xT)\| \le \zeta_1 \triangleq 1 - \epsilon < 1 \tag{58}$$

almost surely. This is because there must exist a subsequence of $\{x\}$ that converges to the same limit as $\limsup_{x\to\infty} \frac{1}{x}\|\mathcal{X}^{(x)}(xT)\|$.

Next, we show that the sequence $\{\frac{1}{x}\|\mathcal{X}^{(x)}(xT)\|, x = 1, 2, \dots\}$ is uniformly integrable. Note that link capacities are all finite (equal to one, as we assumed in the system model); then for all time slots $t > 0$, we have that

Pi^kit) = (1 + < (1 + e)tt%-^, (59) 
for all $l$ and $k$. Define a random variable

e(r) ^ i ((1 + 1^1 • L--)(x + Es Fs^^'H^T)) + Ek T.f=i P^ir) + 2) . 

Note that we have 

ZiZkQtki^T)<x + j:,F.^-\xT), 

and 

a'^i;1{xT)<x + y.,^f^''\xT). 

Then, we have 

l\\x^-\xT)\\ = i(E.E.QS(^r) + rE.EfcO!5(xr)l + Ag(xr))i < G(r), 

and 

E[e(r)] < i ((1 + 1^1 • + Y.S >^sxT) + (1 + 6) Y.I Ek Erlm + a.) + 2) 

< i (x(l + • L--)(l + TE, A,) + (1 + e)xr • 1^1 • L--(|f I + A,) + 2) 

< (1 + 1^1 • L'"^")(l + T^, A,) + (1 + e)r • 1^1 • L°^^"(|^I + A,) + 2 

< oo, 

where the first inequality is from (59) and the assumption on our arrival processes.

Therefore, it follows from the Dominated Convergence Theorem that the sequence $\{\frac{1}{x}\|\mathcal{X}^{(x)}(xT)\|, x = 1, 2, \dots\}$ is uniformly integrable. Then, the almost sure convergence in (58) along with uniform integrability implies the following convergence in the mean:

$$\limsup_{x\to\infty} E\Big[\frac{1}{x}\|\mathcal{X}^{(x)}(xT)\|\Big] \le 1 - \epsilon.$$

Since the above convergence holds for any sequence of processes $\{\frac{1}{x}\mathcal{X}^{(x)}(xT), x = 1, 2, \dots\}$, the condition of type (15) in Lemma 10 is satisfied. This completes the proof of Proposition 1.

Appendix B 
Proof of Lemma 11

First, we prove the convergence and continuity properties for the processes associated with data queues. It follows from the strong law of large numbers that $\frac{1}{x_n}F_s^{(x_n)}(x_n t) \to \lambda_s t$; hence, the convergence (16) holds, and each of the limiting functions $f_s$ is Lipschitz continuous. Also, note that for any fixed $0 \le t_1 \le t_2$, due to finite link capacities (in particular, all equal to one under our unit capacity assumption), we have that

$$\frac{1}{x_n}\Big(D^{(x_n)}_{l,k}(x_n t_2) - D^{(x_n)}_{l,k}(x_n t_1)\Big) \le t_2 - t_1. \tag{60}$$

Thus, the sequence of functions $\{\frac{1}{x_n}D^{(x_n)}_{l,k}(x_n\,\cdot)\}$ is uniformly bounded and uniformly equicontinuous. Consequently, by the Arzela-Ascoli Theorem, there must exist a subsequence under which (22) holds. Note that (60) also implies that each of the limiting functions $d_{l,k}$ is Lipschitz continuous. Recall that $U_{s,k}(t)$ denotes the cumulative number of packets transmitted from the $(k-1)$-st hop to the $k$-th hop for flow $s$ up to time slot $t$. Then, convergence (17) holds similarly to (22) for $k > 1$, and holds from (16) for $k = 1$. Hence, convergence (18) trivially follows from the definition of $A_{l,k}(t)$ and (17). Similarly, each of the limiting functions $u_{s,k}$ and $a_{l,k}$ is Lipschitz continuous.

Since the sequence $\{\frac{1}{x_{n}}Q^{(x_{n})}_{l,k}(0)\}$ is bounded by 1 from (14), there exists a further subsequence (of the subsequence already chosen above, and for simplicity still denoted by $x_{n_j}$) such that $\frac{1}{x_{n_j}}Q^{(x_{n_j})}_{l,k}(0) \to q_{l,k}(0)$. Hence, convergence (20) trivially follows from the queue evolution equation and convergences (18) and (22). Also, it follows that each of the limiting functions $q_{l,k}$ is Lipschitz continuous.

Recall that $\Psi_{l,k}(t) = D_{l,k}(t) - D_{l,k}(t-1)$ and $P_{l,k}(t) = A_{l,k}(t) - A_{l,k}(t-1)$; hence, the sequences $\{\frac{1}{x_{n_j}}\int_0^{x_{n_j}t}\Psi^{(x_{n_j})}_{l,k}(\tau)\,d\tau\}$ and $\{\frac{1}{x_{n_j}}\int_0^{x_{n_j}t}P^{(x_{n_j})}_{l,k}(\tau)\,d\tau\}$ are identical to the sequences $\{\frac{1}{x_{n_j}}D^{(x_{n_j})}_{l,k}(x_{n_j}t)\}$ and $\{\frac{1}{x_{n_j}}A^{(x_{n_j})}_{l,k}(x_{n_j}t)\}$, respectively. This in turn implies that the convergences (26) and (22) hold, where $\int_0^t \psi_{l,k}(\tau)\,d\tau = d_{l,k}(t)$ and $\int_0^t p_{l,k}(\tau)\,d\tau = a_{l,k}(t)$. The convergence (24) follows from an inequality similar to (60) by applying the Arzela-Ascoli Theorem.

Using similar arguments, we can prove the results for the processes associated with the shadow queues. This completes the proof of Lemma 11.

Appendix C 
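Proof of Lemma 16

Note that the total number of packets waiting in the previous hops for $Q_{l,k+1}$ at time slot $t$ is no greater than $\sum_i\sum_{h\le k} Q_{i,h}(t)$. Then, we have

$$A_{l,k+1}(t) \ge \sum_s H^s_{l,k+1}F_s(t) - \sum_i\sum_{h\le k} Q_{i,h}(t). \tag{61}$$

Since $q_{i,h}$ is stable for all $i \in \mathcal{L}$ and all $h \le k$, there exists a finite $T_1^k > 0$ such that $\sum_i\sum_{h\le k} q_{i,h}(t) = 0$, for all regular time $t \ge T_1^k$. Let $\delta > 0$ be fixed, and consider all times $\nu \in [t, t+\delta]$, where $t \ge T_1^k$. Recall that $\{x_{n_j}\}$ is a positive subsequence for which the convergence to the fluid limit holds u.o.c. For an arbitrary $\sigma > 0$, there exists a large enough $j$ so that

$$\left|\frac{1}{x_{n_j}}\sum_i\sum_{h\le k} Q_{i,h}(x_{n_j}\nu)\right| < \sigma, \tag{62}$$

for all $\nu \in [t, t+\delta]$.

Consider time slots $\mathcal{T} \triangleq \{\lceil x_{n_j}t\rceil, \lceil x_{n_j}t\rceil + 1, \dots, \lfloor x_{n_j}(t+\delta)\rfloor\}$. Eq. (62) can be rewritten as

$$\sum_i\sum_{h\le k} Q_{i,h}(\tau) < \sigma x_{n_j}, \tag{63}$$

for all time slots $\tau \in \mathcal{T}$. Then for all $t \ge T_1^k$ and all $l \in \mathcal{L}$, we have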

Pi,k+i{t) = ^ Jo Pi,k+nVdT- = lim^^o ^^^J ■ g 



lim lim 

(5-)-0a;„.-s>oo 



*"= (1 + e) lim lim ^- — 

W Z^r=rix„.] r 

> (1 + e) lim lim 

(5-5.0 a;„j-5>oo OXn 



+ 1™ 



<5-^0 2:„^.^oo [tXnJ - 1 fe„j. 



(1 + e) lim lim 



1 



^ ' T=\tx^;\ 



id) 



t + (5 t 
"1 5 



(1 + e) lim ( ^ • log 



where (a), (6) and (c) are from ( [6T]) and ( [63l) . respectively, and (d) is from dST] ) and (|52l ). 
Since $\sigma > 0$ can be arbitrary, we complete the proof by letting $\sigma \to 0$.
Appendix D

Stability of the Shadow Queues under LQ-MWS 

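Similarly to (16)-(29), we can establish the fluid limits of the system: $(f, u, q, \pi, \psi, a, d, p, \hat{q}, \hat{\pi}, \hat{\psi}, \hat{a}, \hat{d}, \hat{p})$, and we have the following fluid model equations:

\begin{align}
f_s(t) &= \lambda_s t, \tag{64}\\
q_l(t) &= q_l(0) + a_l(t) - d_l(t), \tag{65}\\
a_l(t) &= \sum_s\sum_k H^s_{l,k}u_{s,k}(t), \tag{66}\\
a_l(t) &= \int_0^t p_l(\tau)\,d\tau, \tag{67}\\
d_l(t) &= \int_0^t \psi_l(\tau)\,d\tau, \tag{68}\\
\psi_l(t) &\le \pi_l(t), \tag{69}\\
\frac{d}{dt}q_l(t) &= p_l(t) - \psi_l(t), \tag{70}\\
\frac{d}{dt}q_l(t) &= \begin{cases} p_l(t) - \pi_l(t), & \text{if } q_l(t) > 0,\\ \left(p_l(t) - \pi_l(t)\right)^+, & \text{otherwise,} \end{cases} \tag{71}\\
\hat{q}_l(t) &= \hat{q}_l(0) + \hat{a}_l(t) - \hat{d}_l(t), \tag{72}\\
\hat{a}_l(t) &= \int_0^t \hat{p}_l(\tau)\,d\tau, \tag{73}\\
\hat{d}_l(t) &= \int_0^t \hat{\psi}_l(\tau)\,d\tau, \tag{74}\\
\hat{\psi}_l(t) &\le \hat{\pi}_l(t), \tag{75}\\
\frac{d}{dt}\hat{q}_l(t) &= \hat{p}_l(t) - \hat{\psi}_l(t), \tag{76}\\
\frac{d}{dt}\hat{q}_l(t) &= \begin{cases} \hat{p}_l(t) - \hat{\pi}_l(t), & \text{if } \hat{q}_l(t) > 0,\\ \left(\hat{p}_l(t) - \hat{\pi}_l(t)\right)^+, & \text{otherwise,} \end{cases} \tag{77}\\
\|q(0)\| + \|\hat{q}(0)\| &= 1, \tag{78}\\
\pi_l(t) &= \hat{\pi}_l(t). \tag{79}
\end{align}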

We present a lemma similar to Lemma 13. This will be used to show that the fluid limit model for the sub-system consisting of shadow queues is stable under LQ-MWS. We omit its proof since it follows the same line of analysis as the proof of Lemma 13.

Lemma 19: For all (scaled) time $t > 0$ and for all links $l \in \mathcal{L}$, we have that with probability one,

$$\hat{p}_l(t) \le (1+\epsilon)\Big(\sum_s\sum_k H^s_{l,k}\lambda_s + \frac{1}{t}\Big). \tag{80}$$

Now, we can show that the fluid limit model for the sub-system of shadow queues $\hat{q}$ is stable under LQ-MWS.

Lemma 20: The fluid limit model for the sub-system of shadow queues $\hat{q}$ operating under LQ-MWS satisfies that: For any $\zeta > 0$, there exists a finite $T_3 > 0$ such that for any fluid model solution with $\|\hat{q}(0)\| \le 1$, we have that with probability one,

$$\|\hat{q}(t)\| \le \zeta, \quad \text{for all } t \ge T_3, \tag{81}$$

for any arrival rate vector strictly inside $\Lambda^*$.

The proof is similar to that of Lemma 15 and is thus omitted.

Appendix E 
Proof of Proposition 2

To show the stability of the network under PLQ-MWS, it is enough to show that the fluid limit model of the joint system of data queues and shadow queues is stable. Since the fluid limit model for the sub-system of shadow queues is stable from Lemma 20, it remains to show that the fluid limit model for the sub-system of data queues is stable, i.e., that all the sub-queues for hop-class $k$ packets are stable for each $1 \le k \le L^{\max}$. We will prove the stability of sub-queues via a hop-by-hop inductive argument.

Let $Q_{l,k}(t)$ denote the number of packets of hop-class $k$ at $Q_l$ at time slot $t$, and let $A_{l,k}(t)$, $D_{l,k}(t)$, $\Pi_{l,k}(t)$, $\Psi_{l,k}(t)$ and $P_{l,k}(t)$ denote the cumulative arrival, cumulative departure, service, departure and arrival for packets of hop-class $k$ at $Q_l$, respectively. As before, we establish the fluid limits of the system, and obtain (64)-(79) and the following additional fluid model equations: for all (scaled) time $t \ge 0$,



\begin{align}
a_{l,k}(t) &= \sum_s H^s_{l,k}u_{s,k}(t), \tag{82}\\
a_{l,k}(t) &= \int_0^t p_{l,k}(\tau)\,d\tau, \tag{83}\\
\frac{d}{dt}q_{l,k}(t) &= p_{l,k}(t) - \psi_{l,k}(t), \tag{84}\\
\frac{d}{dt}q_{l,k}(t) &= \begin{cases} p_{l,k}(t) - \pi_{l,k}(t), & \text{if } q_{l,k}(t) > 0,\\ \left(p_{l,k}(t) - \pi_{l,k}(t)\right)^+, & \text{otherwise.} \end{cases} \tag{85}
\end{align}

Clearly, packets of hop-class $k$ at link $l$ will not be transmitted under PLQ-MWS unless link $l$ is active at time slot $t$ and $\sum_{j<k}Q_{l,j}(t) < c_l$ (equivalently, $Q_{l,j}(t) = 0$ for all $j < k$ in our setting, since $c_l = 1$), i.e., for all $1 \le k \le L^{\max}$, we have

$$\Pi_{l,k}(t) = \Big(\Pi_l(t) - \sum_{j<k}Q_{l,j}(t)\Big)^+, \tag{86}$$

where $\Pi_l(t) = 1$ if link $l$ is active at time slot $t$, and $\Pi_l(t) = 0$ otherwise. Hence, we have an additional fluid model equation as follows:

$$\pi_{l,k}(t) = \pi_l(t) - \sum_{j<k}\psi_{l,j}(t), \tag{87}$$

for all $1 \le k \le L^{\max}$, and in particular, we have

$$\pi_{l,1}(t) = \pi_l(t), \tag{88}$$

for all $l \in \mathcal{L}$ and for all $t \ge 0$.

From Lemma 20, the fluid limit model for the sub-system consisting of shadow queues is stable, i.e., there exists a finite $T_3 > 0$ such that, for all $l \in \mathcal{L}$ and for all time $t \ge T_3$,

$$\pi_l(t) = \hat{\pi}_l(t) \ge \hat{p}_l(t). \tag{89}$$

Next, we show the stability of sub-queues by induction.
Base Case:

We first show that sub-queues $q_{l,1}$ are stable for all $l \in \mathcal{L}$. Note that $E[\hat{P}_l(t)] = (1+\epsilon)\frac{A_l(t)}{t} \ge (1+\epsilon)\frac{A_{l,1}(t)}{t}$, and following the same line of analysis as in the proof of Lemma 16, we can show that

$$\hat{p}_l(t) \ge (1+\epsilon)\sum_s H^s_{l,1}\lambda_s,$$

for all $t \ge 0$. This, along with (88) and (89), implies that

$$\pi_{l,1}(t) \ge (1+\epsilon)\sum_s H^s_{l,1}\lambda_s,$$

for all $l \in \mathcal{L}$ and for all time $t \ge T_3$.

Consider the sub-system that only contains sub-queue $q_{l,1}$, and note that $p_{l,1}(t) = \sum_s H^s_{l,1}\lambda_s$. Then for all $t \ge T_3$, we have $\frac{d}{dt}q_{l,1}(t) = p_{l,1}(t) - \pi_{l,1}(t) \le -\epsilon\sum_s H^s_{l,1}\lambda_s < 0$, if $q_{l,1}(t) > 0$. This implies that the sub-system that consists of $q_{l,1}$ is stable, for all $l \in \mathcal{L}$.
Induction Step:

Next, we show that, if sub-queues $q_{l,j}$ are stable for all $l \in \mathcal{L}$ and all $j \le k$, then each sub-queue $q_{l,k+1}$ is also stable for all $l \in \mathcal{L}$.

Recall that $U_{s,k}(t)$ is the number of packets transmitted from the $(k-1)$-st hop to the $k$-th hop for flow $s$ up to time slot $t$, and $u_{s,k}(t)$ is its fluid limit. Since $q_{l,j}(t)$ is stable for all $l \in \mathcal{L}$ and all $j \le k$, i.e., there exists a finite $T_2^k > 0$ such that $q_{l,j}(t) = 0$ for all regular time $t \ge T_2^k$, we have $u_{s,k+1}(t) = u_{s,k}(t) + q_{s,k}(0) = \cdots = u_{s,1}(t) + \sum_{h\le k}q_{s,h}(0) = \lambda_s t + \sum_{h\le k}q_{s,h}(0)$ for all $s \in \mathcal{S}$ and for all regular time $t \ge T_2^k$. Thus, for all $l \in \mathcal{L}$ and for all $j \le k+1$, we have $a_{l,j}(t) = t\sum_s H^s_{l,j}\lambda_s + \sum_s H^s_{l,j}\sum_{h<j}q_{s,h}(0)$ from (82), and $p_{l,j}(t) = \sum_s H^s_{l,j}\lambda_s$ from (83) by taking the derivative, for all $l \in \mathcal{L}$ and all regular time $t \ge T_2^k$. Hence, from (84) and the stability of $q_{l,j}$ (i.e., $\frac{d}{dt}q_{l,j}(t) = 0$) for all $j \le k$, we have that for all $j \le k$,

$$\psi_{l,j}(t) = p_{l,j}(t) = \sum_s H^s_{l,j}\lambda_s. \tag{90}$$

Note that since

$$E[\hat{P}_l(t)] = (1+\epsilon)\frac{A_l(t)}{t} \ge (1+\epsilon)\frac{\sum_{j\le k+1}A_{l,j}(t)}{t},$$

we can obtain that

$$\hat{p}_l(t) \ge (1+\epsilon)\sum_s\sum_{j\le k+1}H^s_{l,j}\lambda_s, \tag{91}$$

following the same line of analysis as in Lemma 16. Hence, from (87), (89), (90) and (91), we have that

$$\pi_{l,k+1}(t) \ge (1+\epsilon)\sum_s H^s_{l,k+1}\lambda_s + \epsilon\sum_s\sum_{j\le k}H^s_{l,j}\lambda_s. \tag{92}$$

This implies that for all time $t \ge T_2^k$, $\frac{d}{dt}q_{l,k+1}(t) = p_{l,k+1}(t) - \pi_{l,k+1}(t) \le -\epsilon\sum_s\sum_{j\le k+1}H^s_{l,j}\lambda_s < 0$, if $q_{l,k+1}(t) > 0$. Hence, we can conclude that $q_{l,k+1}$ is stable for all $l \in \mathcal{L}$.

Now by induction, we can show that all the data queues in the fluid limits are stable. With Lemma 20, this implies that the fluid limit model of the joint system of data queues and shadow queues is stable. Then, we can conclude Proposition 2 following the same arguments used in the proof of Proposition 1.

Appendix F 
Proof of Lemma 3

Recall that $\mathcal{L}(s)$ denotes the loop-free route of the flow $s$. We prove Lemma 3 in a constructive way, i.e., for a network where flows do not form loops, we give an algorithm that generates a ranking such that the following statements in Lemma 3 hold: 1) for any flow $s \in \mathcal{S}$, the ranks are monotonically increasing when one traverses the links on the route of the flow $s$ from $l^s_1$ to $l^s_{|\mathcal{L}(s)|}$, i.e., $r(l^s_i) < r(l^s_{i+1})$ for all $1 \le i < |\mathcal{L}(s)|$; and 2) the packet arrivals at a link are either exogenous, or forwarded from links with a smaller rank.

We start with some useful definitions. 

Definition 1: Two flows $s_1, s_2 \in \mathcal{S}$ are connected if they have common (directed) links on their routes, i.e., $\mathcal{L}(s_1) \cap \mathcal{L}(s_2) \ne \emptyset$, and disconnected otherwise. A sequence of flows $(\tau_1, \dots, \tau_n)$ is a communicating sequence if every two adjacent flows $\tau_j$ and $\tau_{j+1}$ are connected with each other. Two flows $s_1$ and $s_2$ communicate if there exists a communicating sequence between $s_1$ and $s_2$.
Definition 2: Let $\mathcal{S}(l) \subseteq \mathcal{S}$ denote the set of flows passing through link $l$, and let $\mathcal{S}(Z) = \bigcup_{l\in Z}\mathcal{S}(l)$ denote the set of flows passing through a set of links $Z \subseteq \mathcal{L}$. A non-empty set of links $Z$ is called a component if the following conditions are satisfied:

2) Either $|\mathcal{S}(Z)| = 1$, or any two flows $s_1, s_2 \in \mathcal{S}(Z)$ communicate.

Definition 3: Consider a component $Z$. A sequence^ of flows $(s_1, s_2, \dots, s_N) \subseteq \mathcal{S}(Z)$, where $N \ge 2$, is said to form a flow-loop if one can find two links $l^{s_n}_{i_n}$ and $l^{s_n}_{j_n}$ for each $n = 1, 2, \dots, N$, satisfying

1) $i_n < j_n$ for each $1 \le n \le N$,

2) $l^{s_n}_{j_n} = l^{s_{n+1}}_{i_{n+1}}$ for each $n < N$,

3) $l^{s_N}_{j_N} = l^{s_1}_{i_1}$.

An example of a component that contains a flow-loop is presented in Fig. 5(a), where the network consists of seven links and six flows. The routes of the flows are as follows: $\mathcal{L}(s_1) = (1, 2, 3)$, $\mathcal{L}(s_2) = (3, 4)$, $\mathcal{L}(s_3) = (4, 5)$, $\mathcal{L}(s_4) = (5, 6)$, $\mathcal{L}(s_5) = (6, 7)$, $\mathcal{L}(s_6) = (7, 2)$.
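The six routes of this example can also be used to exercise Definition 1. The sketch below is an illustrative reading of the definition, not code from the paper: `connected` tests for a shared link, and `communicate` searches for a communicating sequence with a breadth-first search. The function names and the dictionary layout are assumptions made for the example.

```python
from collections import deque

# Routes of the six flows in Fig. 5(a), as listed in the text.
ROUTES = {
    "s1": (1, 2, 3), "s2": (3, 4), "s3": (4, 5),
    "s4": (5, 6), "s5": (6, 7), "s6": (7, 2),
}


def connected(routes, a, b):
    """Two flows are connected if their routes share at least one link."""
    return bool(set(routes[a]) & set(routes[b]))


def communicate(routes, a, b):
    """Two flows communicate if a chain of pairwise connected flows links them."""
    seen, queue = {a}, deque([a])
    while queue:
        current = queue.popleft()
        if current == b:
            return True
        for other in routes:
            if other not in seen and connected(routes, current, other):
                seen.add(other)
                queue.append(other)
    return False


print(connected(ROUTES, "s1", "s4"), communicate(ROUTES, "s1", "s4"))
```

Flows $s_1$ and $s_4$ share no link, yet they communicate through $s_2$ and $s_3$, so all six flows satisfy the communication condition of Definition 2.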

Definition 4: A component $Z$ is called a flow-tree if $Z$ does not contain any flow-loops.

Definition 5: Consider a component $Z$. A link $l \in Z$ is called a starting link if there exists a flow $s' \in \mathcal{S}(Z)$ such that $H^{s'}_{l,1} = 1$ and $H^s_{l,k} = 0$ for all other $s \in \mathcal{S}(Z)$ and all $k \ge 2$, i.e., a starting link has only exogenous arrivals. Similarly, a link $l \in Z$ is called an ending link if there exists a flow $s'' \in \mathcal{S}(Z)$ such that $H^{s''}_{l,|\mathcal{L}(s'')|} = 1$, and $H^s_{l,k} = 0$ for all other $s \in \mathcal{S}(Z)$ and all $k < |\mathcal{L}(s)|$, i.e., an ending link transmits only packets that will leave the system immediately. A path $P = (l_{P,1}, l_{P,2}, \dots, l_{P,len(P)})$, where $len(P)$ denotes the length of path $P$ and $l_{P,i}$ denotes the $i$-th hop link of $P$, is called a flow-path if the following conditions are satisfied:

1) Links $l_{P,1}$ and $l_{P,len(P)}$ are the only starting and ending link on the path $P$, respectively.

2) Either $len(P) = 1$, or for each $1 \le i < len(P)$, there exists a flow $s$ such that $l_{P,i} \in \mathcal{L}(s)$ and $l_{P,i+1} \in \mathcal{L}(s)$, i.e., two adjacent links $l_{P,i}$ and $l_{P,i+1}$ are on the route of some flow.

In general, a flow-tree consists of multiple (possibly overlapping) flow-paths. An illustration of a flow-loop, a flow-path, and a flow-tree is presented in Fig. 5. It is clear from Definition 3 that, if there exists a flow-loop in a component, this component must contain a cycle of links, while the converse is not necessarily true. For example, the components in Figs. 5(b) and 5(c) both contain a cycle, while neither of them contains a flow-loop.

^By slightly abusing the notation, we also use $(s_1, s_2, \dots, s_N)$ to denote the set of unique elements of the sequence.

[Figure 5 panels: (a) A component containing a flow-loop; (b) A flow-tree with one flow-path; (c) A flow-tree with five flow-paths]

Fig. 5. Examples of different types of components. Links and flows are denoted by dashed lines with numbers and solid lines with arrows, respectively. Note that links without data flows are omitted (not numbered), and two numbers labeled beside a dashed line stand for two links with opposite directions, e.g., links 1 and 8 in Fig. 5(b). In Fig. 5(a), all flows together form a flow-loop (2, 3, 4, 5, 6, 7), and the component is not a flow-tree. In Fig. 5(b), the component is a flow-tree and consists of one single flow-path: (1, 2, 3, 4, 5, 6, 7, 8). In Fig. 5(c), the component is a flow-tree and consists of five flow-paths: $P_1 = (1, 2, 3, 4, 5, 8, 10)$, $P_2 = (1, 2, 6, 11, 12)$, $P_3 = (7, 8, 10)$, $P_4 = (7, 8, 9, 11, 12)$ and $P_5 = (1, 2, 3, 4, 5, 8, 9, 11, 12)$.

Now, we describe Algorithm 2, which is used to generate a ranking for a network without flow-loops such that the monotone property in Lemma 3 holds.

Let $\mathcal{L}(P)$ denote the set of links belonging to flow-path $P$. Let $T$ denote a flow-tree, and let $\mathcal{P}(T)$ denote the set of all flow-paths in $T$, i.e., $\mathcal{P}(T) = \{P \text{ is a flow-path} \mid \mathcal{L}(P) \subseteq T\}$. Let $P_k(T)$ denote the flow-path chosen in the $k$-th while-loop when running Algorithm 2 for $T$, and let $\mathcal{P}_k(T) = \bigcup_{j<k}\{P_j(T)\}$. Let $r(l)$ denote the rank of link $l \in T$, and let $\mathcal{P}(l)$ denote the set of flow-paths passing through link $l$, i.e., $\mathcal{P}(l) \triangleq \{P \in \mathcal{P}(T) \mid l \in \mathcal{L}(P)\}$. Let $T_k(l) \triangleq \{l' \in \bigcup_{P\in\mathcal{P}(l)\cap\mathcal{P}_k(T)}\mathcal{L}(P) \mid r(l') > r(l)\}$ denote the set of links that belong to the flow-paths of $\mathcal{P}(l) \cap \mathcal{P}_k(T)$ (i.e., flow-paths that pass through link $l$ and are chosen in the $j$-th while-loop for $j < k$) and have a rank greater than $r(l)$.

The details of ranking are provided in Algorithm 2. In line 2, we initialize the rank of all links of $T$ to $-1$. In lines 4-21, we pick a flow-path $P \in \mathcal{P}'$, and assign a rank to each link of $P$ starting from link $l_{P,1}$. We may update a link's rank if we already assigned a rank to that link. The set of flow-paths $\mathcal{P}'$ is updated in line 20. The while-loop continues until $\mathcal{P}'$ becomes empty. We set $count = 1$ in line 6 and assign a rank to link $l_{P,i}$ for each $1 \le i \le len(P)$. For each link $l_{P,i}$, we consider the following three cases: 1) $r(l_{P,i}) = -1$; 2) $r(l_{P,i}) \ge count$; 3) $0 < r(l_{P,i}) < count$.
Case 1): link $l_{P,i}$ has not been assigned a rank yet. We set $r(l_{P,i}) = count$ in line 9.
Case 2): link $l_{P,i}$ already has a rank that is no smaller than the current $count$. In this case, the rank does not need an update, and we set $count = r(l_{P,i})$ in line 11.
Case 3): link $l_{P,i}$ already has a rank that is smaller than the current $count$. In this case, we update the rank of some other links as well as that of link $l_{P,i}$. Specifically, for all the links $l \in T_k(l_{P,i})$, i.e., links that belong to the flow-paths in $\mathcal{P}(l_{P,i}) \cap \mathcal{P}_k(T)$ and have a rank greater than $r(l_{P,i})$, we increase their ranks by $count - r(l_{P,i})$ in lines 13-15. Then, we update the rank of link $l_{P,i}$ by setting it to $count$ in line 16.

After considering all three cases, we increase the value of $count$ by 1 in line 18.

June 14, 2012 DRAFT

Algorithm 2 Rank Assignment
 1: procedure AssignRank($T$)
 2:   $r(l) \leftarrow -1$ for all $l \in T$
 3:   $\mathcal{P}' \leftarrow \mathcal{P}(T)$
 4:   while $\mathcal{P}' \ne \emptyset$ do
 5:     pick a flow-path $P \in \mathcal{P}'$
 6:     $count \leftarrow 1$
 7:     for $1 \le i \le len(P)$ do
 8:       if $r(l_{P,i}) = -1$ then
 9:         $r(l_{P,i}) \leftarrow count$
10:       else if $r(l_{P,i}) \ge count$ then
11:         $count \leftarrow r(l_{P,i})$
12:       else
13:         for all $l \in T_k(l_{P,i})$ do
14:           $r(l) \leftarrow r(l) + (count - r(l_{P,i}))$
15:         end for
16:         $r(l_{P,i}) \leftarrow count$
17:       end if
18:       $count \leftarrow count + 1$
19:     end for
20:     $\mathcal{P}' \leftarrow \mathcal{P}' \setminus \{P\}$
21:   end while
22: end procedure

TABLE I
The evolution of the ranking for the flow-tree in Fig. 5(c)

Iteration k    Ranking of links 1-12
               (-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1)
1              (1,2,3,4,5,-1,-1,6,-1,7,-1,-1)
2              (1,2,3,4,5,3,-1,6,-1,7,4,5)
3              (1,2,3,4,5,3,1,6,-1,7,4,5)
4              (1,2,3,4,5,3,1,6,7,7,8,9)
5              (1,2,3,4,5,3,1,6,7,7,8,9)
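The three cases can be traced with a short Python sketch of Algorithm 2. This is an illustrative rendering rather than the authors' implementation. It assumes that flow-paths are picked in the order given, and that $T_k(l)$ is built only from flow-paths handled in earlier while-loops, which is the reading that reproduces Table I. The names `assign_rank`, `LINKS` and `FIG_5C_PATHS` are introduced here for the example.

```python
def assign_rank(links, flow_paths):
    """Sketch of Algorithm 2; flow-paths are processed in the given order."""
    rank = {link: -1 for link in links}            # line 2: all ranks start at -1
    processed = []                                  # flow-paths of earlier while-loops
    history = []                                    # ranking after each while-loop
    for path in flow_paths:                         # lines 4-5: pick the next flow-path
        count = 1                                   # line 6
        for link in path:                           # line 7
            if rank[link] == -1:                    # Case 1: no rank assigned yet
                rank[link] = count
            elif rank[link] >= count:               # Case 2: existing rank is large enough
                count = rank[link]
            else:                                   # Case 3: existing rank is too small
                shift = count - rank[link]
                affected = {other
                            for earlier in processed if link in earlier
                            for other in earlier
                            if rank[other] > rank[link]}
                for other in affected:              # lines 13-15: shift higher-ranked links
                    rank[other] += shift
                rank[link] = count                  # line 16
            count += 1                              # line 18
        processed.append(path)                      # line 20: path is done
        history.append(tuple(rank[link] for link in links))
    return rank, history


# Flow-tree of Fig. 5(c): links 1-12 and flow-paths P1..P5, picked in this order.
LINKS = list(range(1, 13))
FIG_5C_PATHS = [
    (1, 2, 3, 4, 5, 8, 10),
    (1, 2, 6, 11, 12),
    (7, 8, 10),
    (7, 8, 9, 11, 12),
    (1, 2, 3, 4, 5, 8, 9, 11, 12),
]
final_rank, history = assign_rank(LINKS, FIG_5C_PATHS)
for k, row in enumerate(history, start=1):
    print(k, row)
```

Case 3 is triggered only once in this run: in the fourth while-loop, link 11 moves from rank 4 to rank 8, and link 12, which lies after it on the earlier flow-path $P_2$, is shifted from 5 to 9.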
The intention of this ranking is to assign a rank to each link such that the ranks are monotonically increasing when one traverses any flow-path from its starting link. Algorithm 2 may give different rankings to a given flow-tree depending on the order of choosing flow-paths. We give two examples for illustration as follows. In Fig. 5(b), one (and the unique one in this case) example of the ranking for the flow-tree is (1, 2, 3, 4, 5, 6, 7, 8) for links 1-8. In Fig. 5(c), one example of the ranking for the flow-tree is (1, 2, 3, 4, 5, 3, 1, 6, 7, 7, 8, 9) for links 1-12. The evolution of the ranking for the flow-tree in Fig. 5(c) is presented in Table I, where flow-path $P_i$ is chosen in the $i$-th while-loop, for $i = 1, 2, 3, 4, 5$.

Since we assume $\sum_s\sum_k H^s_{l,k} \ge 1$ for all $l \in \mathcal{L}$, a network graph $\mathcal{G}$ can be decomposed into multiple disjoint components. Clearly, a network has no flow-loops if and only if all the components of the network are flow-trees. Without loss of generality, in the rest of the proof, we assume that the network that we consider consists of one single component, which is a flow-tree under the condition of Lemma 3. The same argument applies to the case with multiple disjoint components. We claim the following lemma and provide its proof in Appendix G.

Lemma 21: Algorithm 2 assigns a rank to each link of a flow-tree $T$ such that for any flow-path
$P \in \mathcal{P}(T)$, the ranks are monotonically increasing when one traverses the links of $P$ from $l_{P,1}$ to
$l_{P,\mathrm{len}(P)}$, i.e., $r(l_{P,i}) < r(l_{P,i+1})$ for all $1 \le i < \mathrm{len}(P)$ and for any $P \in \mathcal{P}(T)$.
Now, consider any flow $s \in \mathcal{S}$. The statement 1) holds trivially for the case of $|\mathcal{L}(s)| = 1$. Hence, we
assume that $|\mathcal{L}(s)| > 1$. It is clear that for any $1 \le i < |\mathcal{L}(s)|$, the links $l^s_i$ and $l^s_{i+1}$ must belong to some
flow-path $P \in \mathcal{P}(\mathcal{G})$, where $\mathcal{G}$ is assumed to be a flow-tree. Therefore, the statement 1) follows from
Lemma 21.

Note that the packet arrivals at a link are either exogenous or from the previous hop on the route of 
some flow passing through it. Owing to the monotonically increasing rank assignment, it is clear that 
these previous hop links have a smaller rank. Hence, the statement 2) immediately follows from statement 
1). This completes the proof of Lemma 3.

Appendix G 
Proof of Lemma 21

We want to show that Algorithm 2 assigns a rank to each link of flow-tree $T$ satisfying $r(l_{P,i}) <
r(l_{P,i+1})$ for all $1 \le i < \mathrm{len}(P)$ and for any $P \in \mathcal{P}(T)$. We use induction.
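The property to be proved is easy to state as a predicate. The helper below is a hypothetical illustration (the name `ranks_increase_along` and the dictionary representation of the rank function are assumptions, not part of the paper); it checks the strict inequality between every pair of adjacent links on one flow-path.

```python
from typing import Dict, Hashable, Sequence


def ranks_increase_along(path: Sequence[Hashable], rank: Dict[Hashable, int]) -> bool:
    """Return True if the ranks strictly increase from the first to the last link of the path."""
    # zip(path, path[1:]) enumerates the adjacent pairs (i-th hop, (i+1)-st hop).
    return all(rank[a] < rank[b] for a, b in zip(path, path[1:]))
```

A flow-path with a single link has no adjacent pairs, so the predicate holds vacuously, which matches the trivial case in the proof of Lemma 3.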

Recall that $P_k(T)$ denotes the flow-path chosen in the $k$-th while-loop, and $\mathcal{P}_k(T) = \bigcup_{j \le k} P_j(T)$.
We denote $P_k(T)$ and $\mathcal{P}_k(T)$ by $P_k$ and $\mathcal{P}_k$, respectively, whenever there is no confusion.
Base Case:

The case of $k = 1$ is trivial. Since we initialize $r(l_{P_1,i}) = -1$ for all $1 \le i \le \mathrm{len}(P_1)$, we should
have $r(l_{P_1,i}) = i$ for all $1 \le i \le \mathrm{len}(P_1)$ from lines 9 and 18 of Algorithm 2 after running the first
while-loop.
Induction Step: 

We show that after running the $k$-th while-loop of Algorithm 2, if

$r(l_{P_j,i}) < r(l_{P_j,i+1})$ for all $1 \le i < \mathrm{len}(P_j)$ and for all $j \le k$, (93)

then after running the $(k+1)$-st while-loop the same result holds for all $j \le k+1$. In other words,
once Algorithm 2 assigns the ranks for links of a flow-path in a monotonically increasing way, this
property does not change afterward. We also prove this induction step by induction.

We first show that if (93) holds, then after the first iteration (for assigning a rank to link $l_{P_{k+1},1}$) of
the $(k+1)$-st while-loop, (93) still holds. When we start the $(k+1)$-st while-loop, we have count = 1,
and $r(l_{P_{k+1},1})$ must be in one of the following two cases: 1) $r(l_{P_{k+1},1}) = -1$ if the rank of link $l_{P_{k+1},1}$ is
not assigned yet, or 2) $r(l_{P_{k+1},1}) \ge \mathit{count}$ otherwise. Then, Algorithm 2 will assign a rank of 1 to link
$l_{P_{k+1},1}$ in the former case (line 9), or will not change its rank in the latter case (line 11). Hence, (93)
still holds.
Now suppose that after assigning the ranks of links up to link $l_{P_{k+1},n}$, which is the $n$-th hop of the
flow-path chosen in the $(k+1)$-st while-loop, we have $r(l_{P_{k+1},m-1}) < r(l_{P_{k+1},m})$ for all $1 < m \le n$,
and (93) holds. Then we want to show that after assigning a rank to the next hop $l_{P_{k+1},n+1}$, we still
have both $r(l_{P_{k+1},m-1}) < r(l_{P_{k+1},m})$ for all $m \le n+1$, and (93). We show this when $n = 2$ for ease of
presentation. One can easily extend the analysis to the case when $n > 2$. After assigning a rank to link
$l_{P_{k+1},1}$, we have $\mathit{count} = r(l_{P_{k+1},1}) + 1$ from line 18 of Algorithm 2. At this moment, the rank of link
$l_{P_{k+1},2}$ is either 1) $r(l_{P_{k+1},2}) = -1$, 2) $r(l_{P_{k+1},2}) \ge \mathit{count}$, or 3) $0 < r(l_{P_{k+1},2}) < \mathit{count}$. We discuss the
three cases as follows.
Case 1): $r(l_{P_{k+1},2}) = -1$.

In this case, since Algorithm 2 sets $r(l_{P_{k+1},2})$ to count in line 9, we have $r(l_{P_{k+1},2}) > r(l_{P_{k+1},1})$.
The rank of links of $P_j$ for all $j \le k$ is not changed, and (93) still holds.
Case 2): $r(l_{P_{k+1},2}) \ge \mathit{count}$.

In this case, since Algorithm 2 does not change the rank $r(l_{P_{k+1},2})$, we have $r(l_{P_{k+1},2}) \ge \mathit{count} >
r(l_{P_{k+1},1})$. The rank of links of $P_j$ for all $j \le k$ is not changed, and (93) still holds.
Case 3): $0 < r(l_{P_{k+1},2}) < \mathit{count}$.

Note that in this case, we have $r(l_{P_{k+1},1}) \ge r(l_{P_{k+1},2})$ before assigning a new rank to link $l_{P_{k+1},2}$.
Since Algorithm 2 sets $r(l_{P_{k+1},2})$ to count in line 16, we will have $r(l_{P_{k+1},2}) > r(l_{P_{k+1},1}) = \mathit{count} - 1$.
Now what remains to show is that after the rank update for links of $\Gamma_{k+1}(l_{P_{k+1},2})$ in lines 13-15, we still
have $r(l_{P_{k+1},2}) > r(l_{P_{k+1},1})$ and (93) still holds.

Recall that $\Gamma_k(l) = \{ l' \in \bigcup_{P \in \mathcal{P}(l) \cap \mathcal{P}_k(T)} \mathcal{L}(P) \mid r(l') > r(l) \}$ denotes the set of links that belong
to the flow-paths of $\mathcal{P}(l) \cap \mathcal{P}_k(T)$ (i.e., flow-paths that pass through link $l$ and are chosen in the $j$-th
while-loop for $j \le k$) and have a rank greater than $r(l)$. Let $\Omega = \Gamma_{k+1}(l_{P_{k+1},2}) \cup \{l_{P_{k+1},2}\}$ denote the
union of $\Gamma_{k+1}(l_{P_{k+1},2})$ and $\{l_{P_{k+1},2}\}$. Algorithm 2 updates only the ranks of the links in $\Omega$, by adding
$\mathit{count} - r(l_{P_{k+1},2})$ to each of them. We claim that $l_{P_{k+1},1} \notin \Omega$, i.e., the rank $r(l_{P_{k+1},1})$ is not changed after the
update, which implies that $r(l_{P_{k+1},2}) > r(l_{P_{k+1},1})$ still holds after the update. We prove this claim by
contradiction. Suppose that $l_{P_{k+1},1} \in \Omega$. Then there exists a flow-path $P' \in \mathcal{P}(l_{P_{k+1},2}) \cap \mathcal{P}_{k+1}$ such that
links $l_{P_{k+1},1}$ and $l_{P_{k+1},2}$ both belong to $P'$, and link $l_{P_{k+1},2}$ appears earlier than $l_{P_{k+1},1}$ on the flow-path $P'$.
This implies that flow-paths $P'$ and $P_{k+1}$ form a flow-loop, which contradicts the definition of a flow-tree.

Next, we want to show that (93) still holds after the rank update. Note that before the rank update,
due to (93), two adjacent links $l_{P_j,i}$ and $l_{P_j,i+1}$ satisfy $r(l_{P_j,i}) < r(l_{P_j,i+1})$ for any $j \le k$ and any
$i < \mathrm{len}(P_j)$. We want to show that, after the rank update, we still have $r(l_{P_j,i}) < r(l_{P_j,i+1})$. We consider
the following four cases for two adjacent links $l_{P_j,i}$ and $l_{P_j,i+1}$.
Case i): $l_{P_j,i} \in \Omega$ and $l_{P_j,i+1} \in \Omega$.

In this case, since Algorithm 2 increases the rank of both links $l_{P_j,i}$ and $l_{P_j,i+1}$ by $\mathit{count} - r(l_{P_{k+1},2})$, we
still have $r(l_{P_j,i}) < r(l_{P_j,i+1})$ after the update.
Case ii): $l_{P_j,i} \notin \Omega$ and $l_{P_j,i+1} \notin \Omega$.

In this case, since Algorithm 2 does not change the rank of links $l_{P_j,i}$ and $l_{P_j,i+1}$, we still have
$r(l_{P_j,i}) < r(l_{P_j,i+1})$ after the update.
Case iii): $l_{P_j,i} \notin \Omega$ and $l_{P_j,i+1} \in \Omega$.

In this case, since Algorithm 2 increases the rank of link $l_{P_j,i+1}$ by $\mathit{count} - r(l_{P_{k+1},2})$ and does not
change the rank of link $l_{P_j,i}$, we still have $r(l_{P_j,i}) < r(l_{P_j,i+1})$ after the update.
Case iv): $l_{P_j,i} \in \Omega$ and $l_{P_j,i+1} \notin \Omega$.

This case is infeasible by the definition of $\Omega$ and (93) from the previous step. Since links $l_{P_j,i}$
and $l_{P_j,i+1}$ are two adjacent links on the flow-path $P_j$, there exists a flow $s$ such that $l_{P_j,i}, l_{P_j,i+1} \in \mathcal{L}(s)$
by the definition of a flow-path (Definition 5), and we have $r(l_{P_j,i}) < r(l_{P_j,i+1})$ before the rank
update. Hence, if $l_{P_j,i} \in \Omega$, we should have $l_{P_j,i+1} \in \Omega$ from the definition of $\Omega$.

We can show the property of monotonically increasing ranking for Case 3) by combining sub-cases i),
ii), iii) and iv). Results for Cases 1), 2) and 3) complete the induction step when $n = 2$. One can easily
extend the analysis to the case when $n > 2$, and this completes the proof.

Appendix H 
Proof of Proposition 4

We want to show that a network where flows do not form loops, i.e., all the components are flow-trees,
is stable under FLQ-MWS for any traffic with arrival rate vector that is strictly inside $\Lambda^*$.

We know from Lemma 3 that there exists a ranking $R(\mathcal{L})$ such that the monotone property holds.
Without loss of generality, we assume that the minimum rank is 1, and use $r(\mathcal{L}) = \max_{l \in \mathcal{L}} r(l)$ to denote
the maximum rank among all the links. We give the following definitions that are used in the proof.

Definition 6: We divide $\mathcal{L}$ into $r(\mathcal{L})$ disjoint subsets: $R_k = \{l \in \mathcal{L} \mid r(l) = k\}$, for $1 \le k \le r(\mathcal{L})$.
Then $R_k$ is called the depth-k set, and a link $l_k \in R_k$ is called a depth-k link.
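As a small illustration of Definition 6, the partition into depth-k sets can be computed directly from a rank assignment. The function name `depth_sets` and the dictionary input are hypothetical; the sketch assumes the minimum rank is 1, as in the text, and returns one (possibly empty) set for every depth from 1 to the maximum rank.

```python
from typing import Dict, Hashable, Set


def depth_sets(rank: Dict[Hashable, int]) -> Dict[int, Set[Hashable]]:
    """Group links by rank: depth k maps to the set of links whose rank equals k."""
    max_rank = max(rank.values())
    return {
        k: {link for link, r in rank.items() if r == k}
        for k in range(1, max_rank + 1)
    }
```

The induction in the proof then proceeds over these sets in increasing order of k.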

Recall that the fluid limit model for the sub-system consisting of shadow queues is stable from
Lemma 20. We show by induction that all data queues are stable.
Base Case: 

First, Lemma 3 implies that for any $l_1 \in R_1$, its arrivals are exogenous, i.e., $A_{l_1}(t) = \sum_s H^s_{l_1,1} F_s(t)$.
Following the same line of analysis for the proof of Proposition 1, we can show that $\pi_{l_1}(t) \ge
(1+\epsilon) \sum_s H^s_{l_1,1} \lambda_s$ and $p_{l_1}(t) = \sum_s H^s_{l_1,1} \lambda_s$. Then
$\frac{d}{dt} q_{l_1}(t) = p_{l_1}(t) - \pi_{l_1}(t) \le -\epsilon \sum_s H^s_{l_1,1} \lambda_s < 0$, if $q_{l_1}(t) > 0$.
This implies that $q_{l_1}(t)$ is stable for all $l_1 \in R_1$.
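The negative drift in the base case gives a concrete drain-time bound for the fluid queue. The sketch below is only an illustration under the worst case allowed by the inequality, i.e., a drift of exactly $-\epsilon \sum_s \lambda_s$ while the queue is positive; the function names and the numbers are hypothetical and do not come from the paper.

```python
from typing import Sequence


def drain_time_bound(q0: float, eps: float, rates: Sequence[float]) -> float:
    """Upper bound on the time for the fluid queue to empty, starting from q0."""
    drift = eps * sum(rates)  # magnitude of the (negative) drift when the queue is positive
    if drift <= 0:
        raise ValueError("eps and the total rate must be positive")
    return q0 / drift


def worst_case_queue(q0: float, eps: float, rates: Sequence[float], t: float) -> float:
    """Fluid queue length at time t when the drift equals -eps * sum(rates) exactly."""
    return max(0.0, q0 - eps * sum(rates) * t)


# Hypothetical numbers: initial fluid level 2, eps = 0.1, two flows with rates 0.2 and 0.3.
bound = drain_time_bound(2.0, 0.1, [0.2, 0.3])
```

Any trajectory that satisfies the drift inequality stays at or below `worst_case_queue` and therefore reaches zero no later than `bound`.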
Induction Step: 

Next, we show that if $q_l$ is stable for all $l \in \bigcup_{j \le k} R_j$, then $q_{l_{k+1}}$ is also stable for all $l_{k+1} \in R_{k+1}$,
along with the stability of all $\hat{q}_l$, for $1 \le k < K$.

Lemma 3 implies that for any $l_{k+1} \in R_{k+1}$, its arrivals are either exogenous or from certain links of
$\bigcup_{j \le k} R_j$. Since $q_l$ is stable for all $l \in \bigcup_{j \le k} R_j$, following the same line of analysis for the proof of
Proposition 1, we can show that there exists a finite time $T_k > 0$ such that, for all time $t \ge T_k$, we have
$\pi_{l_{k+1}}(t) \ge (1+\epsilon) \sum_{s: l_{k+1} \in \mathcal{L}(s)} \lambda_s$ and $p_{l_{k+1}}(t) = \sum_{s: l_{k+1} \in \mathcal{L}(s)} \lambda_s$. Therefore, for all time $t \ge T_k$, we
have $\frac{d}{dt} q_{l_{k+1}}(t) = p_{l_{k+1}}(t) - \pi_{l_{k+1}}(t) \le -\epsilon \sum_{s: l_{k+1} \in \mathcal{L}(s)} \lambda_s < 0$, if $q_{l_{k+1}}(t) > 0$. This implies that $q_{l_{k+1}}$ is
stable for all $l_{k+1} \in R_{k+1}$.

Therefore, the fluid limit model for the sub-system of data queues is stable by induction. With
Lemma 20, this implies that the fluid limit model of the joint system of data queues and shadow queues
is stable. Then, we complete the proof following the same arguments used in the proof of Proposition 1.

Appendix I 
Proof of Lemma 7

Given any $\gamma \in (0,1)$, suppose that $\lambda$ is strictly inside $(1-\gamma)\Lambda^*$. Then there exists a sufficiently small
$\epsilon > 0$ such that $(1+\epsilon)\lambda$ is strictly inside $(1-\gamma)\Lambda^*$, and we can find a vector $\phi \in (1-\gamma)Co(\mathcal{M})$ such that
$(1+\epsilon)\lambda < \phi$, i.e., $(1+\epsilon) \sum_s \sum_k H^s_{l,k} \lambda_s < \phi_l$ for all $l \in \mathcal{L}$. Let $\beta \triangleq \min_{l \in \mathcal{L}} (\phi_l - (1+\epsilon) \sum_s \sum_k H^s_{l,k} \lambda_s)$.
By definition, we have $\beta > 0$. Let $T'$ be a finite time such that $T' >$ Then, for all regular time

$t \ge T'$, we have

from Lemma 19. This implies that the instantaneous arrival rate of shadow queues is strictly inside a $(1-\gamma)$
fraction of the optimal throughput region $\Lambda^*$.

Let $W_l(q_l) = \int_0^{q_l} g_l(y)\,dy$ and consider a Lyapunov function $V(q(t)) = \sum_{l \in \mathcal{L}} W_l(q_l(t))$. It is sufficient
to show that for any $\zeta_1 > 0$, there exists a $\zeta_2 > 0$ such that $V(q(t)) \ge \zeta_1$ implies $\frac{d}{dt} V(q(t)) \le -\zeta_2$,
for any regular time $t \ge T'$. Since the $W_l(q_l)$'s and $g_l$'s are differentiable, for any regular time $t \ge T'$, we
can obtain the derivative of $V(q(t))$ as

$\frac{d}{dt} V(q(t)) = \sum_{l \in \mathcal{L}} g_l(q_l(t)) \cdot (p_l(t) - \phi_l) + \sum_{l \in \mathcal{L}} g_l(q_l(t)) \cdot (\phi_l - \pi_l(t)).$
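The decomposition of the drift into two terms can be checked numerically. The sketch below assumes, purely for illustration, the weight function $g_l(y) = \log(1+y)$ for every link (the text above only requires the $g_l$'s to be differentiable), for which the integral has the closed form $W(q) = (1+q)\log(1+q) - q$. All function names and numeric values are hypothetical.

```python
import math
from typing import Sequence, Tuple


def g(y: float) -> float:
    """Assumed weight function; the proof only needs g_l to be differentiable."""
    return math.log(1.0 + y)


def W(q: float) -> float:
    """Closed form of the integral of log(1 + y) from 0 to q."""
    return (1.0 + q) * math.log(1.0 + q) - q


def lyapunov(q: Sequence[float]) -> float:
    """V(q): the sum of W over all links."""
    return sum(W(ql) for ql in q)


def drift_terms(
    q: Sequence[float], p: Sequence[float], phi: Sequence[float], pi: Sequence[float]
) -> Tuple[float, float]:
    """Return the two terms of dV/dt: sum g*(p - phi) and sum g*(phi - pi)."""
    first = sum(g(ql) * (pl - fl) for ql, pl, fl in zip(q, p, phi))
    second = sum(g(ql) * (fl - sl) for ql, fl, sl in zip(q, phi, pi))
    return first, second
```

Adding and subtracting $\phi_l$ does not change the total: the two returned terms always sum to $\sum_l g(q_l)(p_l - \pi_l)$, which is the chain-rule derivative of $V$ when $\frac{d}{dt} q_l = p_l - \pi_l$.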

Let us choose $\zeta_3 > 0$ such that $V(q(t)) \ge \zeta_1$ implies $\max_l q_l(t) \ge \zeta_3$. Then following a similar
argument as in the proof of Lemma 15 for the final result of (55), we can conclude that the first term
is bounded as follows:

$\sum_{l \in \mathcal{L}} g_l(q_l(t)) \cdot (p_l(t) - \phi_l) \le -\zeta_2 < 0,$

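and that the second term becomes non-positive due to the following. We first note that $\max_l q_l(t) > 0$ from
$V(q(t)) > 0$. Then at time slots $\mathcal{T} = \{\lceil x_{n_j} t \rceil, \lceil x_{n_j} t \rceil + 1, \cdots, \lfloor x_{n_j}(t+\delta) \rfloor\}$, for any $Q_B > 0$, we have
$\max_l Q_l(\tau) \ge Q_B$ for all time slots $\tau \in \mathcal{T}$ with large enough $j$ and small enough $\delta$. From Lemma 6, given
any $\theta \in (0,1)$, for all time slots $\tau \in \mathcal{T}$, with probability greater than $1-\theta$, LQ-CSMA chooses a schedule
$M(\tau) \in \mathcal{M}$ that satisfies

$\sum_{l \in \mathcal{L}} g_l(Q_l(\tau)) \cdot M_l(\tau) \ge (1-\gamma) \max_{M \in \mathcal{M}} \sum_{l \in \mathcal{L}} g_l(Q_l(\tau)) \cdot M_l.$ (96)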
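Condition (96) says that the chosen schedule achieves at least a $(1-\gamma)$ fraction of the maximum weight over all feasible schedules. A brute-force check of this condition can be sketched as follows. This is an illustration only: the explicit enumeration of feasible schedules, the function names, and the weight function $g(Q) = \log(1+Q)$ are assumptions, and LQ-CSMA itself does not enumerate schedules.

```python
import math
from typing import Sequence


def weight(schedule: Sequence[int], queues: Sequence[float]) -> float:
    """Total weight of a 0/1 schedule under the assumed weight function log(1 + Q)."""
    return sum(math.log(1.0 + q) * m for q, m in zip(queues, schedule))


def satisfies_96(
    schedule: Sequence[int],
    feasible: Sequence[Sequence[int]],
    queues: Sequence[float],
    gamma: float,
) -> bool:
    """Check whether the schedule's weight is at least (1 - gamma) times the maximum weight."""
    best = max(weight(m, queues) for m in feasible)
    return weight(schedule, queues) >= (1.0 - gamma) * best
```

For instance, on a hypothetical three-link line network where links 1 and 3 can transmit together but link 2 conflicts with both, queue lengths (10, 1, 10) make the schedule (1, 0, 1) the maximum-weight one, while (0, 1, 0) falls well short of a 0.9 fraction.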

Hence, similar to Chapter 4 of [24], from condition (96), with probability greater than $1-\theta$, the fluid
limit $\pi(t)$ under LQ-CSMA satisfies

$\sum_{l \in \mathcal{L}} g_l(q_l(t)) \cdot \pi_l(t) \ge \max_{\phi \in (1-\gamma)Co(\mathcal{M})} \sum_{l \in \mathcal{L}} g_l(q_l(t)) \cdot \phi_l.$

Therefore, $V(q(t)) \ge \zeta_1$ implies $\frac{d}{dt} V(q(t)) \le -\zeta_2$. This completes the proof.